Abstract

We present a new approach to automated reasoning about 
higher-order programs by extending symbolic execution to 
use behavioral contracts as symbolic values, enabling sym- 
bolic approximation of higher-order behavior. 

Our approach is based on the idea of an abstract reduc- 
tion semantics that gives an operational semantics to pro- 
grams with both concrete and symbolic components. Sym- 
bolic components are approximated by their contract and our 
semantics gives an operational interpretation of contracts-as-values.
The result is an executable semantics that soundly predicts
program behavior, including contract failures, for all
possible instantiations of symbolic components. We show 
that our approach scales to an expressive language of con- 
tracts including arbitrary programs embedded as predicates, 
dependent function contracts, and recursive contracts. Sup- 
porting this feature-rich language of specifications leads to 
powerful symbolic reasoning using existing program asser- 
tions. 

We then apply our approach to produce a verifier for 
contract correctness of components, including a sound and 
computable approximation to our semantics that facilitates 
fully automated contract verification. Our implementation 
is capable of verifying contracts expressed in existing pro- 
grams, and of justifying valuable contract-elimination opti- 
mizations. 



1. Behavioral contracts as symbolic values 

Whether in the context of dynamically loaded JavaScript 
programs, low-level native C code, widely-distributed li- 
braries, or simply intractably large code bases, automated 
reasoning tools must cope with access to only part of the 
program. To handle missing components, the omitted por- 
tions are often assumed to have arbitrary behavior, greatly 
limiting the precision and effectiveness of the tool. 

Of course, programmers using external components do 
not make such conservative assumptions. Instead, they at- 
tach specifications to these components, often with dynamic 
enforcement. These specifications increase their ability to 
reason about programs that are only partially known. But
reasoning solely at the level of specifications also makes
verification and analysis challenging, and it requires
substantial effort to write sufficient specifications.

The problem of program analysis and verification in the 
presence of missing data has been widely studied, producing 
many effective tools that apply symbolic execution to non- 
deterministically consider many or all possible inputs. These 
tools typically determine constraints on the missing data, and 
reason using these constraints. Since the central lesson of 
higher-order programming is that computation is data, we 
propose symbolic execution of higher-order programs for 
reasoning about systems with omitted components, taking 
specifications to be our constraints. 

Our approach to higher-order symbolic execution there- 
fore combines specification-based symbolic reasoning about 
opaque components with semantics-based concrete reason- 
ing about available components; we characterize this tech- 
nique as specifications as values. As specifications, we adopt 
higher-order behavioral software contracts. Contracts have 
two crucial advantages for our strategy. First, they provide 
benefit to programmers outside of verification, since they 
automatically and dynamically enforce their described invariants.
Because of this, modern languages such as C#,
Haskell and Racket come with rich contract libraries which 
programmers already use [15, 17, 22]. Rather than requiring
programmers to annotate code with assertions, we leverage
the large body of code that already attaches contracts at
code boundaries. For example, the Racket standard library 
features more than 4000 uses of contracts ETIl . Second, the 
meaning of contracts as specifications is neatly captured by 
their dynamic semantics. As we shall see, we are able to 
turn the semantics of contract systems into tools for verifi- 
cation of programs with contracts. Verifying contracts holds 
promise both for ensuring correctness and improving perfor- 
mance: in existing Racket code, contract checks take more 
than half of the running time for large computations such 
as rendering documentation and type checking large pro- 
grams [39]. 

Our plan is as follows: we begin with a review of contracts
in the setting of Contract PCF [12] (§2). Next, we extend
Contract PCF with abstract values described by specifications,
producing a core model of symbolic execution
for our language of higher-order contracts, which we dub
Symbolic PCF with Contracts (§3). This allows us to give
(define-contract list/c
  (rec/c X (or/c empty? (cons/c nat? X))))
(module opaque
  (provide
    [insert (nat? (and/c list/c sorted?)
             -> (and/c list/c sorted?))]
    [nums list/c]))
(module insertion-sort
  (require opaque)
  (define (foldl f l b)
    (if (empty? l) b
        (foldl f (cdr l) (f (car l) b))))
  (define (sort l) (foldl insert l empty))
  (provide
    [sort
     (list/c -> (and/c list/c sorted?))]))
> (sort nums)
(• (and/c list/c sorted?))

Figure 1. Verification of insertion sort



non-deterministic behavior to programs in which any number
of modules are omitted, represented only by their specifications,
here given as contracts. We accomplish this by
treating contracts as abstract values, with the behavior of 
any of their possible concrete instantiations. 

Contracts as abstract values provides a rich domain of 
symbols, including precise specifications for abstract higher- 
order values. These values present new complications to 
soundness, addressed with a demonic context, a universal 



context for discovering blame for behavioral values (§3.5).

We then extend this core calculus to a model of pro- 
grams with modules — including opaque modules whose im- 
plementation is not available — and a much richer contract 
language (§4), modeling the functional core of Racket [19].
We show that our symbolic execution strategy soundly scales 
up from Symbolic PCF to this more complex language while 
preserving its advantages in higher-order reasoning. More- 
over, the technique of describing symbolic values with con- 
tracts becomes even more valuable in an untyped setting. 

As the modular semantics is uncomputable, this verifica- 
tion strategy is necessarily incomplete. To address this, we 
apply the technique of abstracting abstract machines B2l 
to derive first an abstract machine and then a computable 
approximation to our semantics directly from our reduction
system (§5). We then turn our semantics into a tool
for program verification which is integrated into the Racket
toolchain and IDE (§6). Users can click a button and explore
the behavior of their program in the presence of opaque mod- 
ules, either with a potentially non-terminating semantics, or 
with a computable approximation. Finally, we consider the 
extensive prior work in symbolic execution, verification of 



specifications, and analysis of higher-order programs (§7)
and conclude. 

Our semantics allows us to use contracts for verification 
in two senses: to verify that programs do not violate their
contracts, and to verify rich properties of programs by
expressing them as contracts. In fact, the semantics alone is,
in itself, a program verifier. The execution of a modular program
which runs without contract errors on any path is a
verification that the concrete portions of the program never
violate their contracts, no matter the instantiation of the omitted
portions. This technique is surprisingly effective, particularly
in systems with many layers, each of which uses contracts
at its boundaries. For example, the implementation
of insertion sort in figure 1 is verified to live up to its contract,
which states that it always produces a sorted list. This
verification works despite the omitted insert function, used 
in higher-order fashion as an argument to foldl. 

Contributions We make the following contributions: 

1. We propose abstract reduction semantics as a technique
for higher-order symbolic execution. This is a variant of 
operational semantics that treats specifications as values, 
to enable modular reasoning about partially unknown 
programs. 

2. We give an abstract semantics for a core typed functional 
language with contracts that equips symbolic values rep- 
resented as sets of contracts with an operational interpre- 
tation, allowing sound reasoning about opaque program 
components with rich specifications by soundly predict- 
ing program behavior for all possible instantiations of 
those opaque components. We then scale this semantics 
up to model a more realistic untyped language with mod- 
ules and an expressive set of contract combinators. 

3. We derive a sound and computable program analysis 
based on our semantics that can serve as the basis for 
automated program verification, optimization, and static 
debugging. 

4. We provide a prototype implementation of an interactive 
verification environment based on our theoretical models 
which successfully verifies existing programs with con- 
tracts. 

2. Contracts and Contract PCF 

The basic building block of our specification system is 
behavioral software contracts. Originally introduced by 
Meyer [31], contracts are executable specifications that sit at
the boundary between software components. In a first-order 
setting, properly assessing which component violated a con- 
tract at run-time is straightforward. Matters are complicated 
when higher-order values such as functions or objects are included
in the language. Findler and Felleisen [17] introduced
the notion of blame and established a semantic framework 
for properly assessing blame at run-time in a higher-order 
language, providing the theoretical basis for contract systems
such as Racket's.



(module double
  (provide [dbl ((even? -> even?)
                 -> (even? -> even?))])
  (define dbl (λ (f) (λ (x) (f (f x))))))
> (dbl (λ (x) 7))
top-level broke the contract on dbl;
expected <even?>, given: 7



To illustrate, consider the program above, which consists 
of a module and top-level expression. Module double pro- 
vides a dbl function that implements twice-iterated applica- 
tion, operating on functions on even numbers. The top-level 
expression makes use of the dbl function, but incorrectly — 
dbl is applied to a function that produces 7. 

Contract checking and blame assignment in a higher-order
program is complicated by the fact that it is not decidable
in general whether the argument of dbl is a function
from and to even numbers. Thus, higher-order contracts are 
pushed down into delayed lower-order checks, but care must 
be taken to get blame right. In our example, the top-level 
is blamed, and rightly so, even though even? witnesses the 
violation when f is applied to x while executing dbl. 

2.1 Contract PCF 

Dimoulas et al. [12, 13] introduce Contract PCF as a core
calculus for the investigation of contracts, which we take as
the starting point for our model. CPCF extends PCF [33]
with contracts for base values and first-class functions. We 
provide a brief recap of the syntax and semantics of CPCF. 

Contracts for flat values, $\mathsf{flat}(E)$, employ predicates that
may use the full expressive power of CPCF. Function contracts,
$C_1 \mapsto C_2$, consist of a pre-condition contract $C_1$ for
the argument to the function and a post-condition contract
$C_2$ for the function's result. Dependent function contracts,
$C_1 \mapsto \lambda X.C_2$, bind $X$ to the argument of the function in the
post-condition contract $C_2$, and thus express a dependency
between a function's input and result. In the remainder of the
paper, we treat the non-dependent function contract $C_1 \mapsto C_2$
as shorthand for $C_1 \mapsto \lambda X.C_2$ where $X$ is fresh.

A contract $C$ is attached to an expression with the monitor
construct $\mathsf{mon}^{f,g}_{h}(C, E)$, which carries three labels: $f$, $g$, and
$h$, denoting the names of components. (An implementation
would synthesize these names from the source code.) The
monitor checks that any interaction between the expression and
its context is in accordance with the contract.

Component labels play an important role in case a contract
failure is detected during contract checking. In such a
case, blame is assigned with the $\mathsf{blame}^{f}_{g}$ construct, which
denotes that the component named $f$ broke its contract with $g$.



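PCF with Contracts

$$
\begin{array}{llcl}
\text{Types} & T & ::= & B \mid T \to T \mid \mathsf{con}(T) \\
\text{Base types} & B & ::= & \mathbb{N} \mid \mathbb{B} \\
\text{Terms} & E & ::= & A \mid X \mid E\ E \mid \mu X{:}T.E \mid \mathsf{if}\ E\ E\ E \\
 & & \mid & O_1(E) \mid O_2(E, E) \mid \mathsf{mon}^{f,g}_{h}(C, E) \\
\text{Operations} & O_1 & ::= & \mathsf{zero?} \mid \mathsf{false?} \mid \ldots \\
 & O_2 & ::= & + \mid - \mid \wedge \mid \vee \mid \ldots \\
\text{Contracts} & C & ::= & \mathsf{flat}(E) \mid C \mapsto C \mid C \mapsto \lambda X{:}T.C \\
\text{Answers} & A & ::= & V \mid \mathcal{E}[\mathsf{blame}^{f}_{g}] \\
\text{Values} & V & ::= & \lambda X{:}T.E \mid 0 \mid 1 \mid -1 \mid \ldots \mid \mathsf{tt} \mid \mathsf{ff} \\
\text{Evaluation contexts} & \mathcal{E} & ::= & [\,] \mid \mathcal{E}\ E \mid V\ \mathcal{E} \mid O_2(\mathcal{E}, E) \mid O_2(V, \mathcal{E}) \\
 & & \mid & O_1(\mathcal{E}) \mid \mathsf{if}\ \mathcal{E}\ E\ E \mid \mathsf{mon}^{f,g}_{h}(C, \mathcal{E})
\end{array}
$$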



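Semantics for PCF with Contracts    $E \longmapsto E'$

$$
\begin{array}{lcl}
\mathsf{if}\ \mathsf{tt}\ E_1\ E_2 & \longmapsto & E_1 \\
\mathsf{if}\ \mathsf{ff}\ E_1\ E_2 & \longmapsto & E_2 \\
(\lambda X{:}T.E)\ V & \longmapsto & [V/X]E \\
\mu X{:}T.E & \longmapsto & [\mu X{:}T.E/X]E \\
O(\vec{V}) & \longmapsto & A \quad \text{if } \delta(O, \vec{V}) = A \\
\mathsf{mon}^{f,g}_{h}(C_1 \mapsto \lambda X{:}T.C_2, V) & \longmapsto & \lambda X{:}T.\mathsf{mon}^{f,g}_{h}(C_2, (V\ \mathsf{mon}^{g,f}_{h}(C_1, X))) \\
\mathsf{mon}^{f,g}_{h}(\mathsf{flat}(E), V) & \longmapsto & \mathsf{if}\ (E\ V)\ V\ \mathsf{blame}^{f}_{g}
\end{array}
$$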



CPCF is equipped with a standard type system for PCF
plus the addition of a contract type $\mathsf{con}(T)$, which denotes
the set of contracts for values of type $T$ [12, 13]. The type
system is straightforward, so for the sake of space, we defer
the details to an appendix (§A.1).



The semantics of CPCF is given as a call-by-value reduction
relation on programs. One-step reduction is written
as $E \longmapsto E'$ and defined as the above relation, contextually
closed over evaluation contexts $\mathcal{E}$. The reflexive transitive
closure of one-step reduction is written $E \longmapsto^{*} E'$.

The first five cases of the reduction relation are stan- 
dard for PCF. The remaining two cases implement con- 
tract checking for function contracts and flat contracts, re- 
spectively. The monitor of a function contract on a func- 
tion reduces to a function that monitors its input with re- 
versed blame labels and monitors its output with the original 
blame labels.¹ The monitor of a flat contract reduces to an
if-expression which tests whether the predicate holds. If it 
does, the value is returned. If it doesn't, a contract error is 
signaled with the appropriate blame. 
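
The two monitor reductions can be sketched in Python as a small wrapper function. This is only an illustration under stated assumptions, not the calculus itself: contracts are encoded as tagged tuples, the names `Blame`, `flat`, `arrow`, and `mon` are hypothetical, and the labels `"double"`, `"top-level"`, and `"dbl"` are chosen to replay the dbl program of section 2. As in the reduction rules, the check on a function is delayed until the wrapped function is applied, and the argument is monitored with the blame labels swapped.

```python
class Blame(Exception):
    """Raised when the party named `blamed` breaks its contract."""
    def __init__(self, blamed, contract_owner):
        super().__init__(f"{blamed} broke the contract on {contract_owner}")
        self.blamed = blamed
        self.contract_owner = contract_owner

def flat(pred):
    """A flat contract: a predicate checked immediately."""
    return ("flat", pred)

def arrow(dom, rng):
    """A (non-dependent) function contract dom -> rng."""
    return ("->", dom, rng)

def mon(contract, value, pos, neg, owner):
    """Monitor `value` against `contract`.

    `pos` is blamed if the value misbehaves, `neg` if its context does.
    """
    if contract[0] == "flat":
        if contract[1](value):
            return value
        raise Blame(pos, owner)
    _, dom, rng = contract
    def wrapped(x):
        # The argument is checked with the blame labels swapped;
        # the result is checked with the original labels.
        checked_arg = mon(dom, x, neg, pos, owner)
        return mon(rng, value(checked_arg), pos, neg, owner)
    return wrapped

def is_even(n):
    return isinstance(n, int) and n % 2 == 0

even_to_even = arrow(flat(is_even), flat(is_even))
dbl = mon(arrow(even_to_even, even_to_even),
          lambda f: lambda x: f(f(x)),
          pos="double", neg="top-level", owner="dbl")
```

Applying `dbl(lambda x: 7)` to an even number raises `Blame` with the top-level as the blamed party, even though the failing `even?` test runs inside the body of dbl.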

¹ For brevity, we have presented the so-called lax dependent contract rule,
although our implementation uses indy [13], which is obtained by replacing
the right-hand side with:

XX:T.mon f h ' 9 ([mon f h M (Ci,X)/X]C 2 ,V mon 9 h J ' (d,X)). 
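Symbolic PCF with Contracts

$$
\begin{array}{llcl}
\text{Prevalues} & U & ::= & \bullet^{T} \mid \lambda X{:}T.E \mid 0 \mid 1 \mid -1 \mid \ldots \mid \mathsf{tt} \mid \mathsf{ff} \\
\text{Values} & V & ::= & U/\mathcal{C}
\end{array}
$$

Semantics for Symbolic PCF with Contracts    $E \longmapsto E'$

$$
\begin{array}{lcll}
\mathsf{if}\ V\ E_1\ E_2 & \longmapsto & E_1 & \text{if } \delta(\mathsf{true?}, V) \ni \mathsf{tt} \\
\mathsf{if}\ V\ E_1\ E_2 & \longmapsto & E_2 & \text{if } \delta(\mathsf{true?}, V) \ni \mathsf{ff} \\
(\lambda X{:}T.E)\ V & \longmapsto & [V/X]E & \\
\mu X{:}T.E & \longmapsto & [\mu X{:}T.E/X]E & \\
O(\vec{V}) & \longmapsto & A & \text{if } \delta(O, \vec{V}) \ni A \\
(\bullet^{T \to T'}/\mathcal{C})\ V & \longmapsto & \bullet^{T'}/\{[V/X]C_2 \mid C_1 \mapsto \lambda X{:}T.C_2 \in \mathcal{C}\} & \\
(\bullet^{T \to T'}/\mathcal{C})\ V & \longmapsto & \mathsf{havoc}_{T}\ V &
\end{array}
$$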



3. Symbolic PCF with Contracts 

We now describe an extension to Contract PCF that enriches 
the language with symbolic values, drawn from the language 
of contracts, and show the revised semantics. The basic idea 
of SCPCF is to take the values of CPCF as "pre"-values $U$
and add a notion of unknown values (of type $T$), written
"$\bullet^T$". Purely unknown values have arbitrary behavior, but
we will refine unknowns by attaching a set of contracts that 
specify an agreement between the program and the missing 
component. Such refinements can guide an operational characterization
of a program. Pre-values are refined by a set of
contracts to form a value, $U/\mathcal{C}$, where $\mathcal{C}$ ranges over sets of
contracts.

The high-level goal of the following semantics is to en- 
able the running of programs with unknown components. 
The main requirement is that the results of running such 
computations should soundly and precisely approximate the 
result of running that same program after replacing an un- 
known with any allowable value. More precisely, if a pro- 
gram involving some value V produces an answer A, then 
abstracting away that value to an unknown should produce 
an approximation of A: 



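$$
\text{if } \mathcal{E}[V] \longmapsto^{*} A \text{ and } \vdash V : T, \text{ then } \mathcal{E}[\bullet^T] \longmapsto^{*} A',
$$

where $A'$ "approximates" $A$ in a way that is made formal in
section 3.5.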



Notation: Abstract (or synonymously: symbolic) values $V$
range over values of the form $\bullet^T/\mathcal{C}$. Whenever the refinement
of a value is irrelevant, we omit the $\mathcal{C}$ set. We write
$V \cdot C$ for $U/\mathcal{C} \cup \{C\}$ where $V = U/\mathcal{C}$.

The semantics given above replaces that of section 2,
equipping the operational semantics with an interpretation
of symbolic values. (The semantics of contract checking is 
deferred for the moment.) 

To do so requires two changes: 



1. the $\delta$ relation must be extended to interpret operations
when applied to symbolic values, and 

2. the one-step reduction relation must be extended to the 
case of (1) branching on a (potentially) symbolic value, 
and (2) applying a symbolic function. 

3.1 Operations on symbolic values 

Typically, the interpretation of operations is defined by a
function $\delta$ that maps an operation and argument values to
an answer. So for example, $\delta(\mathsf{add1}, 0) = 1$. The result
of applying a primitive may either be a value in case the
operation is defined on its given arguments, or blame in case
it is not.

The extension of $\delta$ to interpret symbolic values is largely
straightforward. It starts by generalizing $\delta$ from a function
from an operation and values to an answer, to a relation
between operations, values, and answers (or equivalently, to
a function from an operation and values to sets of answers).
This enables multiple results when a symbolic value does not
convey enough information to uniquely determine a single
result. For example, $\delta(\mathsf{zero?}, 0) = \{\mathsf{tt}\}$, but $\delta(\mathsf{zero?}, \bullet^{\mathbb{N}}) = \{\mathsf{tt}, \mathsf{ff}\}$.
From here, all that remains is adding appropriate
clauses to the definition of $\delta$ for handling symbolic values.
As an example, the definition includes:



$$
\delta(+, V_1, V_2) \ni \bullet^{\mathbb{N}}, \quad \text{if } V_1 \text{ or } V_2 = \bullet^{\mathbb{N}}/\mathcal{C}
$$



The remaining cases are similarly straightforward. 

The revised reduction relation reduces an operation, non-deterministically,
to any answer in the $\delta$ relation.
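
One way to picture the generalized relation is as a function returning a set of answers. The following Python sketch is illustrative only: the class `Unknown` and the function `delta` are hypothetical names, only three operations are covered, and blame for ill-typed arguments is left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Unknown:
    """A symbolic value of base type `ty` ("N" or "B")."""
    ty: str

def delta(op, *args):
    """Return the set of possible answers of applying `op` to `args`."""
    symbolic = any(isinstance(a, Unknown) for a in args)
    if op == "add1":
        return {Unknown("N")} if symbolic else {args[0] + 1}
    if op == "+":
        # A sum involving an unknown number is itself an unknown number.
        return {Unknown("N")} if symbolic else {args[0] + args[1]}
    if op == "zero?":
        # An unknown number may or may not be zero: both answers are possible.
        return {True, False} if symbolic else {args[0] == 0}
    raise ValueError(f"unknown operation: {op}")
```

On concrete arguments the result is always a singleton, so the relation coincides with the usual function; only symbolic arguments introduce more than one answer.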

3.2 Branching on symbolic values 

The shift from the semantics of section 2 to section 3 involves
what appears to be a cosmetic change in the reduction
of conditionals, e.g., from

$$\mathsf{if}\ \mathsf{tt}\ E_1\ E_2 \longmapsto E_1$$

to

$$\mathsf{if}\ V\ E_1\ E_2 \longmapsto E_1 \quad \text{if } \delta(\mathsf{true?}, V) \ni \mathsf{tt}.$$



In the absence of symbolic values, the two relations are
equivalent, but once symbolic values are introduced, the
latter handles branching on potentially symbolic values by
deferring to $\delta$ to determine if $V$ is possibly true. Consequently,
branching on $\bullet^{\mathbb{B}}$ results in both $E_1$ and $E_2$ since
$\delta(\mathsf{true?}, \bullet^{\mathbb{B}}) = \{\mathsf{tt}, \mathsf{ff}\}$. Without this slight refactoring for
conditionals, additional cases for the reduction relation are
required, and these cases would largely mimic the existing
if reductions. By reformulating in terms of $\delta$, we enable the
uniform reduction of abstract and concrete values.
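
A minimal Python sketch of this uniform treatment follows. It assumes, purely for illustration, that `None` stands for the unknown boolean and that branches are opaque placeholders; `possible_truth` and `step_if` are hypothetical names.

```python
def possible_truth(v):
    """Stand-in for delta(true?, v): the truth values `v` may have.

    `None` models an unknown boolean (an assumption of this sketch).
    """
    return {True, False} if v is None else {bool(v)}

def step_if(v, e1, e2):
    """All one-step successors of `if v e1 e2`."""
    truth = possible_truth(v)
    successors = []
    if True in truth:
        successors.append(e1)
    if False in truth:
        successors.append(e2)
    return successors
```

A single definition covers both cases: a concrete test value yields one successor, while an unknown test value yields both branches.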

3.3 Applying symbolic functions 

When applying a symbolic function, the reduction relation 
must take two distinct possibilities into account. The first 
is that the argument to the symbolic function escapes, but 
no failure occurs in the unknown context, so the function
returns an abstract value refined by the range contracts of
the function. The second is that the use of V in an unknown
context results in the blame of V. To discover if blaming V 
is possible we rely upon a havoc function, which iteratively 
explores the behavior of V for possible blame. Its only 
purpose is to uncover blame, thus it never produces a value; 
it either diverges or blames V. In this simplified model, the 
only behavioral values are functions, so we represent all 
possible uses of the escaped value by iteratively applying it 
to unknown values. This construction represents a universal 
"demonic" context to discover a way to blame V if possible, 
and we have named the function havoc to emphasize the 
analogy to Boogie's havoc function [14], which serves the
same purpose, but in a first-order setting.

The havoc function is indexed by the type of its argu- 
ment. At base type, values do not have behavior, so havoc 
simply produces a diverging computation. At function type, 
havoc produces a function that applies its argument to an ap- 
propriately typed unknown input value and then recursively 
applies havoc at the result type to the output: 

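$$
\begin{array}{lcl}
\mathsf{havoc}_{B} & = & \mu x.x \\
\mathsf{havoc}_{T \to T'} & = & \lambda x{:}T \to T'.\ \mathsf{havoc}_{T'}\,(x\ \bullet^{T})
\end{array}
$$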

To see how havoc finds all possible errors in a term,
consider the following function guarded by a contract:

$$\mathsf{mon}(\lambda f{:}\mathbb{N} \to \mathbb{N}.\ \mathsf{sqrt}\ (f\ 0),\ (\mathsf{any} \mapsto \mathsf{any}) \mapsto \mathsf{any})$$

where any is the trivial contract $\mathsf{flat}(\lambda x.\mathsf{tt})$ and sqrt has the
type $\mathbb{N} \to \mathbb{N}$ and contract $\mathsf{flat}(\mathsf{positive?}) \mapsto \mathsf{flat}(\mathsf{positive?})$.
If we then apply havoc to this term at the appropriate type, it
will supply the input $\bullet^{\mathbb{N} \to \mathbb{N}}$ for $f$. When this abstract value is
applied to 0, it reduces to both a diverging term that produces
no blame, and the symbolic number $\bullet^{\mathbb{N}}$. Finally, sqrt is
applied to $\bullet^{\mathbb{N}}$, which both passes and fails the contract check
on sqrt, since $\bullet^{\mathbb{N}}$ represents both positive and non-positive
numbers; the latter demonstrates the original function could
be blamed.

In contrast, if the original term was wrapped in the contract
$(\mathsf{any} \mapsto \mathsf{flat}(\mathsf{positive?})) \mapsto \mathsf{any}$, then the abstract value
$\bullet^{\mathbb{N} \to \mathbb{N}}$ would have been wrapped in the contract
$\mathsf{any} \mapsto \mathsf{flat}(\mathsf{positive?})$. When the wrapped abstract function is applied
to 0, it then produces the more precise abstract value
$\bullet^{\mathbb{N}} \cdot \mathsf{flat}(\mathsf{positive?})$ as the input to sqrt and fails to blame the
original function.
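
The two scenarios above can be replayed with a bounded Python stand-in for havoc. This is a sketch under several assumptions that are not part of the calculus: exploration is cut off by a `fuel` counter instead of diverging, outcomes are encoded as `("val", v)` or `("blame", label)` pairs, the diverging branch and the failing branch of the check on the unknown function's result are omitted, and the names `Unknown`, `apply_value`, `sqrt_checked`, `term`, and `guarded` are hypothetical.

```python
from dataclasses import dataclass

N = "N"  # the base type of numbers

def fun(dom, rng):
    """The function type dom -> rng."""
    return ("->", dom, rng)

@dataclass(frozen=True)
class Unknown:
    ty: object                            # N or ("->", dom, rng)
    refs: frozenset = frozenset()         # flat contracts known to hold
    result_refs: frozenset = frozenset()  # for functions: contracts on results

def apply_value(f, arg):
    """Apply `f` to `arg`, returning a list of possible outcomes."""
    if isinstance(f, Unknown):            # unknown function: unknown result
        return [("val", Unknown(f.ty[2], f.result_refs))]
    return f(arg)                         # concrete functions return outcome lists

def havoc(ty, v, fuel=5):
    """Collect the labels that some use of `v` at type `ty` can blame."""
    if ty == N or fuel == 0:
        return set()                      # base values have no behavior
    _, dom, rng = ty
    blamed = set()
    for kind, payload in apply_value(v, Unknown(dom)):
        if kind == "blame":
            blamed.add(payload)
        else:
            blamed |= havoc(rng, payload, fuel - 1)
    return blamed

def sqrt_checked(n, caller):
    """sqrt guarded by flat(positive?) on its input; `caller` is blamed on failure."""
    if isinstance(n, Unknown):
        outcomes = [("val", Unknown(N, frozenset({"positive?"})))]
        if "positive?" not in n.refs:
            outcomes.append(("blame", caller))  # n may be non-positive
        return outcomes
    return [("val", n ** 0.5)] if n > 0 else [("blame", caller)]

def term(f):
    """The function that takes f and computes sqrt (f 0); its label is "term"."""
    outcomes = []
    for kind, payload in apply_value(f, 0):
        if kind == "blame":
            outcomes.append((kind, payload))
        else:
            outcomes += sqrt_checked(payload, "term")
    return outcomes

def guarded(f):
    """`term` behind a domain contract promising positive results from f."""
    return term(Unknown(f.ty, f.refs, frozenset({"positive?"})))

TERM_TYPE = fun(fun(N, N), N)
```

Under these assumptions, `havoc(TERM_TYPE, term)` reports that `"term"` can be blamed, while `havoc(TERM_TYPE, guarded)` finds no blame, mirroring the contrast between the two contracts.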

The ability of havoc to find blame if possible is key to our 
soundness result. 

3.4 Contract checking symbolic values 

We now turn to the revised semantics for contract checking 
reductions in the presence of symbolic values. The key ideas 
are that we 

1. avoid checking any contracts which a value provably 
satisfies, and 



2. add flat contracts to a value's refinement set whenever a 
contract check against that value succeeds. 

To implement the first idea, we add a reduction rela- 
tion which sidesteps a contract check and just produces the 
checked value whenever the value proves it satisfies the 
contract. To implement the second idea, we revise the flat 
contract checking reduction relation to produce not just the 
value, but the value refined by the contract in the success 
branch of a flat contract check. 

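Contract checking, revisited

$$
\begin{array}{lcl}
\mathsf{mon}^{f,g}_{h}(C, V) & \longmapsto & V \quad \text{if } \vdash V : C\ \checkmark \\
\mathsf{mon}^{f,g}_{h}(\mathsf{flat}(E), V) & \longmapsto & \mathsf{if}\ (E\ V)\ (V \cdot \mathsf{flat}(E))\ \mathsf{blame}^{f}_{g} \quad \text{if } \nvdash V : \mathsf{flat}(E)\ \checkmark \\
\mathsf{mon}^{f,g}_{h}(C_1 \mapsto \lambda X{:}T.C_2, V) & \longmapsto & \lambda X{:}T.\mathsf{mon}^{f,g}_{h}(C_2, V\ \mathsf{mon}^{g,f}_{h}(C_1, X))
\end{array}
$$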

The judgment $\vdash V : C\ \checkmark$ denotes that $V$ provably satisfies
the contract $C$, which we read as "V proves C." Our
system is parametric with respect to this provability relation
and the precision of the symbolic semantics improves as the
power of the proof system increases. For concreteness, we
consider the following simple, yet useful proof system which
asserts a symbolic value proves any contract it is refined by:

$$\frac{C \in \mathcal{C}}{\vdash U/\mathcal{C} : C\ \checkmark}$$

As we will see subsequently, this relation can easily be 
extended to handle more sophisticated reasoning. 

Taken together, the revised contract checking relation and 
proves relation allow values to remember contracts once 
they are checked and to avoid rechecking in subsequent 
computations. Consider the following program with abstract 
pieces: 

let keygen = mon(unit ↦ flat(prime?), •)
    rsa = mon(flat(prime?) ↦ (any ↦ any), •)
in rsa (keygen ()) "Plaintext"

When invoking keygen produces an abstract number, it 
will be checked against the prime? contract, which will 
non-deterministically both succeed and fail, since keygen's 
source is not available to be verified. However, in the case 
where the check succeeds, the prime? contract is remem- 
bered, meaning that our semantics correctly predicts that the 
top level application does not break rsa's contract by providing
a composite number. Regardless of the implementations
of keygen and rsa, which may themselves be
buggy, their composition is thus verified to uphold its obligations.
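
The remembering of checked contracts can be sketched in Python as follows. The sketch makes simplifying assumptions: contracts are identified by name, a prevalue of `None` stands for an unknown, provability is the simple membership test described above, and `Value`, `PREDICATES`, `proves`, and `mon_flat` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Value:
    """A prevalue refined by a set of contract names; prevalue None means unknown."""
    prevalue: object = None
    refs: frozenset = frozenset()

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

PREDICATES = {"prime?": is_prime}

def proves(v, contract):
    """A value proves any contract it is refined by."""
    return contract in v.refs

def mon_flat(contract, v, blamed):
    """Return every possible outcome of checking a flat contract on `v`."""
    if proves(v, contract):
        return [("val", v)]                  # the check is skipped entirely
    refined = Value(v.prevalue, v.refs | {contract})
    if v.prevalue is None:                   # unknown: both branches are possible
        return [("val", refined), ("blame", blamed)]
    ok = PREDICATES[contract](v.prevalue)
    return [("val", refined)] if ok else [("blame", blamed)]
```

Checking the unknown result of keygen yields both a blame outcome for keygen and a value refined by `prime?`; handing that refined value to rsa's domain check produces only the value, so the top level is never blamed.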

3.5 Soundness 

Soundness relies on the definition of approximation between
terms. We write $E \sqsubseteq E'$ to mean $E'$ approximates $E$, or
conversely $E$ refines $E'$. The basic intuition for approximation
is that an abstract value, which can be thought of as standing
for a set of acceptable concrete values, approximates a
concrete value if that value is in the set the abstract value
denotes.

Since the $\bullet^T$ value stands for any value of type $T$, we
have the following axiom:

$$\frac{\vdash V : T}{V \sqsubseteq \bullet^T}$$

(In subsequent judgments, we assume both sides of the approximation
relation are typable at the same type and thus
omit type annotations and judgments.) A monitored expression
is approximated by its contract:

$$\mathsf{mon}(C, E) \sqsubseteq \bullet \cdot C$$

To handle the approximation of wrapped functions, we employ
the following rule, which matches the right-hand side
of the reduction relation for the monitor of a function:

$$(\lambda X.\mathsf{mon}(D, (V\ \mathsf{mon}(C, X)))) \sqsubseteq \bullet \cdot C \mapsto \lambda X.D$$

Arbitrary contract refinements may be introduced as follows:

$$\frac{V \sqsubseteq V'}{V \cdot C \sqsubseteq V'} \qquad \frac{V \sqsubseteq V'}{V \cdot C \sqsubseteq V' \cdot C}$$

A contract may be eliminated when a value proves it:

$$\frac{\vdash V : C\ \checkmark}{V \sqsubseteq V \cdot C}$$

If an expression approximates a monitored expression, it's
OK to monitor the approximating expression too:

$$\frac{\mathsf{mon}(C, E) \sqsubseteq E'}{\mathsf{mon}(C, E) \sqsubseteq \mathsf{mon}(C, E')}$$

Finally, $\sqsubseteq$ is reflexively, transitively, and compatibly closed.

Theorem 1 (Soundness of Symbolic PCF with Contracts).

If $E \sqsubseteq E'$ and $E \longmapsto^{*} A$, then there exists some $A'$ such
that $E' \longmapsto^{*} A'$ where $A \sqsubseteq A'$.

Proof. (Sketch) The proof follows from (1) a completeness
result for havoc, which states that if $\mathcal{E}[V] \longmapsto^{*} \mathcal{E}'[\mathsf{blame}^{\ell}]$
where $\ell$ is not in $\mathcal{E}$, then $\mathsf{havoc}\ V \longmapsto^{*} \mathcal{E}''[\mathsf{blame}^{\ell}]$, and (2)
the following main lemma: if $E_1 \longmapsto E_1' \neq \mathcal{E}[\mathsf{blame}^{\ell}]$, and
$E_1 \sqsubseteq E_2$, then $E_2 \longmapsto^{*} E_2'$ and $E_1' \sqsubseteq E_2'$, which is in turn
proved reasoning by cases on $E_1 \longmapsto E_1'$ and appealing to
auxiliary lemmas that show approximation is preserved by
substitution and primitive operations. The full proof for the
enriched system of section 4 is given in appendix B. □





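$$
\begin{array}{lcl}
P, Q & ::= & \vec{M}\ E \\
M, N & ::= & (\mathsf{module}\ f\ C\ V) \\
E, E' & ::= & f^{\ell} \mid X \mid A \mid (E\ E)^{\ell} \mid \mathsf{if}\ E\ E\ E \mid (O\ \vec{E})^{\ell} \mid \mu X.E \\
 & \mid & \mathsf{mon}^{f,f}_{f}(C, E) \\
U & ::= & n \mid \mathsf{tt} \mid \mathsf{ff} \mid (\lambda X.E) \mid \bullet \mid (V, V) \mid \mathsf{empty} \\
V & ::= & U/\mathcal{C} \\
C, D & ::= & X \mid C \mapsto \lambda X.C \mid \mathsf{flat}(E) \\
 & \mid & (C, C) \mid C \vee C \mid C \wedge C \mid \mu X.C \\
O & ::= & \mathsf{add1} \mid \mathsf{car} \mid \mathsf{cdr} \mid \mathsf{cons} \mid + \mid = \mid o? \mid \ldots \\
o? & ::= & \mathsf{nat?} \mid \mathsf{bool?} \mid \mathsf{empty?} \mid \mathsf{cons?} \mid \mathsf{proc?} \mid \mathsf{false?} \\
A & ::= & V \mid \mathcal{E}[\mathsf{blame}^{\ell}_{\ell'}]
\end{array}
$$

Figure 2. Syntax of Symbolic Core Racket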



The soundness result achieves the high-level goal stated 
at the beginning of this section: we have constructed an ab- 
stract reduction semantics for the sound symbolic execu- 
tion of programs such that their symbolic execution approx- 
imates the behavior of programs for all possible instantia- 
tions of the opaque components. In particular, we can verify 
pieces of programs by running them with missing compo- 
nents, refined by contracts. If the abstract program does not 
blame the known components, no context can cause those 
components to be blamed. 

4. Symbolic Core Racket 

Having developed the core ideas of our symbolic executor 
for programs with contracts, we extend our language to an 
untyped core calculus of modular programs with data structures
and rich contracts. This forms a core model of a realistic
programming language, Racket [19]. In addition to
closely modeling our target language, omitting types places 
a greater burden on the contract system and symbolic execu- 
tor. As we see in this section, ours is up to the job. 

To SCPCF we add pairs, the empty list, and related oper- 
ations; contracts on pairs; recursive contracts; and conjunc- 
tive and disjunctive contracts. Predicates, as before, are ex- 
pressed as arbitrary programs within the language itself. Pro- 
grams are organized as a set of module definitions, which as- 
sociate a module name with a value and a contract. Contracts 
are established at module boundaries and here express an 
agreement between a module and the external context. The 
contract checking portion of the reduction semantics moni- 
tors these agreements, maintaining sufficient information to 
blame the appropriate party in case a contract is broken. 

4.1 Syntax 

The syntax of our language is given in figure 2. We write
$\vec{E}$ for a possibly-empty sequence of $E$, and treat these sequences
as sets where convenient. Portions highlighted in
gray are the key extensions over SCPCF, as presented in ear- 
lier sections. 
A program P consists of a sequence of modules followed
by a main expression. Modules are second-class entities that 
name a single value along with a contract to be applied to 
that value. Opaque modules are modules whose body is •. 
Expressions now include module references, labeled by the 
module they appear in; this label is used as the negative 
party for the module's contract. Applications are also la- 
beled; this label is used if the application fails. Pair values 
and the empty list constant are standard, along with their 
operations. Since the language is untyped, we add standard 
type predicates such as nat?. 

The new contract forms include pair contracts, with the 
obvious semantics, conjunction and disjunction of contracts, 
and recursive contracts with contract variables. 

Contract checks $\mathsf{mon}^{f,g}_{h}(C, E)$, which will now be inserted
automatically by the operational semantics, take all
of their labels from the names of modules, with the third label
$h$ representing the module in which the contract originally
appeared. As before, $f$ represents the positive party to the
contract, blamed if the expression does not meet the contract,
and $g$ is the negative party, blamed if the context does
not satisfy its obligations. Whenever these annotations can
be inferred from context, we omit them; in particular, in the 
definition of relations, it is assumed all checks of the form 
mon(C, E) have the same annotations. We omit labels on 
applications whenever they provably cannot be blamed, e.g. 
when the operand is known to be a function. 

A blame expression, $\mathsf{blame}^{\ell}_{\ell'}$, now indicates that the module
(or the top-level expression) named by $\ell$ broke its contract
with $\ell'$, which may be a module or the language, indicated
by $\Lambda$ in the case of primitive errors.

Syntactic requirements: We make the following assump- 
tions of initial, well-formed programs, P: programs are 
closed, every module reference and application is labeled 
with the enclosing module's name, or † if in the top-level
expression, operations are applied with the correct arity, ab- 
stract values only appear in opaque module definitions, and 
no monitors or blame expressions appear in the source pro- 
gram. 

We also require that recursive contracts be productive, 
meaning either a function or pair contract constructor must 
occur between binding and reference. We also require that 
contracts in the source program are closed, both with respect 
to λ-bound and contract variables. Following standard practice,
we will say that a contract is higher-order if it syntactically
contains a function contract; otherwise, the contract
is flat. Flat contracts can be checked immediately, whereas 
higher-order contracts potentially require delayed checks. 
All predicate contracts are necessarily flat. 

Disjunction of contracts: For disjunctions, we require that 
at most one of the disjuncts is higher-order and without loss 
of generality, we assume it is the right disjunct. The reason 
for this restriction is that we must choose at the time of 
the initial check of the contract which disjunct to use;
we cannot just try both because higher-order checks must
be delayed. In Racket, disjunction is therefore restricted to 
contracts that are distinguishable in a first-order way, which 
we simplify to the restriction that only one can be higher- 
order. 
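
The flat versus higher-order distinction and the restriction on disjunctions are purely syntactic, so they can be sketched as a small check over a contract syntax tree. The class and function names below are hypothetical and chosen only for illustration; they are not taken from the paper or its implementation.

```python
from dataclasses import dataclass

# Hypothetical AST for the contract language; all names are illustrative.
@dataclass(frozen=True)
class Flat:            # flat(E): an arbitrary predicate, named here by a string
    pred: str

@dataclass(frozen=True)
class Func:            # function contract with domain and range
    dom: object
    rng: object

@dataclass(frozen=True)
class Pair:            # pair contract
    left: object
    right: object

@dataclass(frozen=True)
class And:             # conjunction
    left: object
    right: object

@dataclass(frozen=True)
class Or:              # disjunction
    left: object
    right: object

@dataclass(frozen=True)
class Rec:             # recursive contract binding a contract variable
    var: str
    body: object

@dataclass(frozen=True)
class Var:             # contract variable
    name: str


def is_higher_order(c):
    """A contract is higher-order if it syntactically contains a function contract."""
    if isinstance(c, Func):
        return True
    if isinstance(c, (Pair, And, Or)):
        return is_higher_order(c.left) or is_higher_order(c.right)
    if isinstance(c, Rec):
        return is_higher_order(c.body)
    return False  # Flat and Var


def disjunctions_ok(c):
    """Check that the left disjunct of every disjunction is flat."""
    if isinstance(c, Or):
        return (not is_higher_order(c.left)
                and disjunctions_ok(c.left) and disjunctions_ok(c.right))
    if isinstance(c, (Pair, And)):
        return disjunctions_ok(c.left) and disjunctions_ok(c.right)
    if isinstance(c, Func):
        return disjunctions_ok(c.dom) and disjunctions_ok(c.rng)
    if isinstance(c, Rec):
        return disjunctions_ok(c.body)
    return True
```

For instance, a disjunction of `Flat("nat?")` with a function contract is accepted when the function contract is on the right and rejected when it is on the left.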

4.2 Reductions 

Evaluation is modeled with one-step reduction on programs,
$P \longmapsto Q$. Since the module context consists solely of
syntactic values, all computation occurs by reduction of the
top-level expression. Thus program steps are defined in terms
of top-level expression steps, carried out in the context of
several module definitions. We model this with a reduction
relation on expressions in a module context, which we write
$M \vdash E \longmapsto E'$. We omit the module context where it
is not used and write $E \longmapsto E'$ instead. Our reduction
system is given with evaluation contexts, which are identical to
those of SCPCF in section 3.

We present the definition of this relation in several parts. 

4.2.1 Applications, operations, and conditionals 

First, the definition of procedure applications, conditionals,
and primitive operations is as usual for a call-by-value language.
Primitive operations are interpreted by a $\delta$ relation (rather
than a function), just as in section 3. The reduction relation
for these terms is defined as follows:



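Basic reductions: $E \longmapsto E'$

$$
\begin{array}{rcll}
((\lambda X.E)\ V)^{\ell} & \longmapsto & [V/X]E & \\
(V\ V')^{\ell} & \longmapsto & \mathsf{blame}^{\ell}_{\Lambda} & \text{if } \delta(\mathsf{proc?}, V) \ni \mathsf{ff} \\
(O\ \vec{V})^{\ell} & \longmapsto & A & \text{if } \delta(O^{\ell}, \vec{V}) \ni A \\
\mathsf{if}\ V\ E\ E' & \longmapsto & E & \text{if } \delta(\mathsf{false?}, V) \ni \mathsf{ff} \\
\mathsf{if}\ V\ E\ E' & \longmapsto & E' & \text{if } \delta(\mathsf{false?}, V) \ni \mathsf{tt}
\end{array}
$$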



Again, we rely on $\delta$ not only to interpret operations, but
also to determine if a value is a procedure or ff; this allows
uniform handling of abstract values, which may (depending
on their remembered contracts) be treated as both true and
false. We add a reduction to $\mathsf{blame}^{\ell}_{\Lambda}$ when
applications are misused; the program has here broken the
contract with the language, which is no longer checked statically
by the type system as it was in SCPCF. Additionally, our rules for if
follow the Lisp tradition (which Racket adopts) in treating
all non-ff values as true.

4.2.2 Basic operations 

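Basic operations, as with procedures and conditionals, follow
SCPCF closely. Operations on concrete values are standard,
and we present only a few selected cases. Operations
on abstract values are more interesting. A few selected cases
are given in figure 3 as examples. Otherwise, the definition
of $\delta$ for concrete values is standard and is omitted here.

Primitive operations (concrete values): $\delta(O^{\ell}, \vec{V}) \ni A$

$$
\begin{array}{l}
\delta(\mathsf{add1}, n) \ni n + 1 \\
\delta(+, n, m) \ni n + m \\
\delta(\mathsf{car}, (V, V')) \ni V \\
\delta(\mathsf{cdr}, (V, V')) \ni V'
\end{array}
$$

Primitive operations (abstract values)

$$
\begin{array}{ll}
\vdash V : o?\ \checkmark & \Rightarrow\ \delta(o?, V) \ni \mathsf{tt} \\
\vdash V : o?\ \text{✗} & \Rightarrow\ \delta(o?, V) \ni \mathsf{ff} \\
\vdash V : o?\ ? & \Rightarrow\ \delta(o?, V) \ni \bullet/\{\mathsf{flat}(\mathsf{bool?})\} \\
\vdash V : \mathsf{nat?}\ \checkmark & \Rightarrow\ \delta(\mathsf{add1}, V) \ni \bullet/\{\mathsf{flat}(\mathsf{nat?})\} \\
\vdash V : \mathsf{nat?}\ \text{✗} & \Rightarrow\ \delta(\mathsf{add1}^{\ell}, V) \ni \mathsf{blame}^{\ell}_{\mathsf{add1}} \\
\vdash V : \mathsf{nat?}\ ? & \Rightarrow\ \delta(\mathsf{add1}, V) \ni \bullet/\{\mathsf{flat}(\mathsf{nat?})\}
  \ \wedge\ \delta(\mathsf{add1}^{\ell}, V) \ni \mathsf{blame}^{\ell}_{\mathsf{add1}} \\
\vdash V : \mathsf{cons?}\ \checkmark & \Rightarrow\ \delta(\mathsf{car}, V) \ni \pi_1(V) \\
\vdash V : \mathsf{cons?}\ \text{✗} & \Rightarrow\ \delta(\mathsf{car}^{\ell}, V) \ni \mathsf{blame}^{\ell}_{\mathsf{car}} \\
\vdash V : \mathsf{cons?}\ ? & \Rightarrow\ \delta(\mathsf{car}, V) \ni \pi_1(V)
  \ \wedge\ \delta(\mathsf{car}^{\ell}, V) \ni \mathsf{blame}^{\ell}_{\mathsf{car}}
\end{array}
$$

Figure 3. Basic operations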

When applying base operations to abstract values, the results
are potentially complex. For example, add1 • might
produce any natural number, or it might go wrong, depending
on what value • represents. We represent this in $\delta$ with
a combination of non-determinism, where $\delta$ relates an
operation and its inputs to multiple answers, as well as abstract
values as results, to handle the arbitrary natural numbers or
booleans that might be produced. A representative selection
of the $\delta$ definition for abstract values is presented in figure 3.

The definition of $\delta$ relies on a proof system relating
predicates and values, just as with contract checking. Here,
$\vdash V : o?\ \checkmark$ means that $V$ is known to satisfy $o?$,
$\vdash V : o?\ \text{✗}$ means that $V$ is known not to satisfy $o?$,
and $\vdash V : o?\ ?$ means that neither is known. For example,
$\vdash 7 : \mathsf{nat?}\ \checkmark$, $\vdash \mathsf{tt} : \mathsf{cons?}\ \text{✗}$,
and $\vdash \bullet : o?\ ?$ for any $o?$. (Again, our system
is parametric with respect to this proof system, although we
present a useful instance in section 4.3.)

Finally, if no case matches, then an error is produced:

$$\delta(O^{\ell}, \vec{V}) \ni \mathsf{blame}^{\ell}_{\Lambda}$$

Labels on operations come from the application site of the
operation in the program, e.g. $(\mathsf{add1}\ 5)^{\ell}$, so that the
appropriate module can be blamed when primitive operations are
misused, as in the last case, and are omitted whenever they
are irrelevant. When primitive operations are misused, the
violated contract is on $\Lambda$, standing for the programming
language itself, just as in the rule for application of
non-functions.
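
To make the relational reading of $\delta$ concrete, the following is a minimal sketch of the add1 cases, where a relation is modeled as a function returning the set of all possible answers. The representation (`Abs`, `Blame`, `judge`) and the assumed list of pairwise-disjoint base predicates are hypothetical simplifications for illustration, not the paper's implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Abs:
    """Abstract value: an opaque value with remembered base predicates (by name)."""
    preds: frozenset = frozenset()

@dataclass(frozen=True)
class Blame:
    party: str     # label of the application site that misused the operation
    broken: str    # whose contract was broken (here, the operation's name)

# Assumption for this sketch: these base predicates are pairwise disjoint.
DISJOINT = {"nat?", "bool?", "cons?", "proc?"}

def judge(v, pred):
    """Return 'proves', 'refutes', or 'neither' for value v and predicate pred."""
    if isinstance(v, Abs):
        if pred in v.preds:
            return "proves"
        if pred in DISJOINT and any(p in DISJOINT for p in v.preds):
            return "refutes"   # v is known to belong to a different base type
        return "neither"
    if pred not in DISJOINT:
        return "neither"
    if isinstance(v, bool):    # test bool first: bool is a subclass of int
        kind = "bool?"
    elif isinstance(v, int) and v >= 0:
        kind = "nat?"
    elif isinstance(v, tuple) and len(v) == 2:
        kind = "cons?"
    else:
        return "neither"       # values outside this sketch's universe
    return "proves" if pred == kind else "refutes"

def delta_add1(v, label):
    """The set of answers related to (add1 v) at application site `label`."""
    verdict = judge(v, "nat?")
    if verdict == "proves":
        return {v + 1} if isinstance(v, int) else {Abs(frozenset({"nat?"}))}
    if verdict == "refutes":
        return {Blame(label, "add1")}
    # Unknown: both a natural-number result and blame are possible.
    return {Abs(frozenset({"nat?"})), Blame(label, "add1")}
```

Applied to a bare abstract value, `delta_add1` returns two answers, which is the non-determinism described above; applied to a value that remembers `nat?`, it returns only the abstract natural number.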

4.2.3 Module references 

To handle references to module-bound variables, we define
a module environment that describes the module context $M$.



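Flat contract reduction: $\mathsf{mon}(C, V) \longmapsto E'$

$$
\begin{array}{rcll}
\mathsf{mon}^{f,g}_{h}(C, V) & \longmapsto & V \cdot C & \text{if } C \text{ is flat and } \vdash V : C\ \checkmark \\
\mathsf{mon}^{f,g}_{h}(C, V) & \longmapsto & \mathsf{blame}^{f}_{h} & \text{if } C \text{ is flat and } \vdash V : C\ \text{✗} \\
\mathsf{mon}^{f,g}_{h}(C, V) & \longmapsto & \mathsf{if}\ (\mathrm{FC}(C)\ V)\ (V \cdot C)\ \mathsf{blame}^{f}_{h} & \text{if } C \text{ is flat and } \vdash V : C\ ?
\end{array}
$$

Flat contract checking: $\mathrm{FC}(C) = E$

$$
\begin{array}{rcl}
\mathrm{FC}(\mu X.C) & = & \mu X.\mathrm{FC}(C) \\
\mathrm{FC}(X) & = & X \\
\mathrm{FC}(\mathsf{flat}(E)) & = & E \\
\mathrm{FC}(C_1 \wedge C_2) & = & \lambda y.(\mathrm{FC}(C_1)\ y) \wedge (\mathrm{FC}(C_2)\ y) \\
\mathrm{FC}(C_1 \vee C_2) & = & \lambda y.(\mathrm{FC}(C_1)\ y) \vee (\mathrm{FC}(C_2)\ y) \\
\mathrm{FC}(\langle C_1, C_2 \rangle) & = & \lambda y.(\mathsf{and}\ (\mathsf{cons?}\ y)\ (\mathrm{FC}(C_1)\ (\mathsf{car}\ y))\ (\mathrm{FC}(C_2)\ (\mathsf{cdr}\ y)))
\end{array}
$$

Figure 4. Flat contracts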

Using the module reference annotation, the environment
distinguishes between self references and external references.
When an external module is referenced ($f \neq g$), its value
is wrapped in a contract check; a self-reference is resolved
to its (unchecked) value. This distinction implements the
notion of "contracts as boundaries" [17]. In other words,
contracts are an agreement between the module and its context,
and the module can behave internally as it likes.
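
As a rough illustration of this boundary discipline, the sketch below resolves a reference to module `f` made from module `g`. The `Module`, `Opaque`, and `Mon` records are hypothetical stand-ins for the paper's syntax, and using `None` to mark an opaque module is an assumption of this sketch only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Module:
    name: str
    contract: str
    value: object = None   # None models an opaque module in this sketch

@dataclass(frozen=True)
class Opaque:
    remembered: str        # an abstract value that remembers the module's contract

@dataclass(frozen=True)
class Mon:
    pos: str               # positive party: the module providing the value
    neg: str               # negative party: the referencing module
    src: str               # module in which the contract originally appeared
    contract: str
    value: object

def resolve(modules, f, g):
    """Resolve a reference to module f occurring inside module g."""
    m = modules[f]
    if f == g:
        return m.value                      # self reference: no contract check
    inner = m.value if m.value is not None else Opaque(m.contract)
    return Mon(pos=f, neg=g, src=f, contract=m.contract, value=inner)
```

A self reference yields the raw value, while an external reference yields a monitor whose positive party is the provider and whose negative party is the client.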



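Module references: $M \vdash f^{g} \longmapsto E$

$$
\begin{array}{rcll}
M \vdash f^{f} & \longmapsto & V & \text{if } (\mathsf{module}\ f\ C\ V) \in M \\
M \vdash f^{g} & \longmapsto & \mathsf{mon}^{f,g}_{f}(C, V) & \text{if } (\mathsf{module}\ f\ C\ V) \in M \\
M \vdash f^{g} & \longmapsto & \mathsf{mon}^{f,g}_{f}(C, \bullet \cdot C) & \text{if } (\mathsf{module}\ f\ C\ \bullet) \in M
\end{array}
$$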



4.2.4 Contract checking 

With the basic rules handled, we now turn to the heart of the 
system, contract checking. As in section 3, as computation is
carried out, we can discover properties of values that may be 
useful in subsequently avoiding spurious contract errors. Our 
primary mechanism for remembering such discoveries is to 
add properties, encoded as contracts, to values as soon as the 
computational process proves them. If a value passes a flat 
contract check, we add the checked contract to the value's 
remembered set. Subsequent checks of the same contract are 
thus avoided. We divide contract checking reductions into 
two categories, those for flat contracts and those for higher- 
order contracts, and consider each in turn. 

Flat contracts: First, checking flat contracts is handled by 
three rules, presented in figure 4, depending on whether the
value has already passed the relevant contract. 

The first two rules consider the case where the value
definitely does pass the contract, written $\vdash V : C\ \checkmark$
("V proves C"), or does not pass, written $\vdash V : C\ \text{✗}$
("V refutes C"). If neither of these is the case, written
$\vdash V : C\ ?$, the third rule implements a contract check by
compiling it to an if-expression. The test is an application of
the function generated by FC(C) to V. If the test succeeds,
$V \cdot C$ is produced. Otherwise, the positive party, here $f$,
is blamed for breaking the contract on $h$.

The three judgments checking the relation between values
and contracts are a simple proof system; by parameterizing
over these relations, we enable our system to make use of
sophisticated existing decision procedures. For the moment,
the key property is that $\vdash V \cdot C : C\ \checkmark$ holds, just
as in section 3.4, and further details are discussed in section 4.3.



Compiling flat checks to predicates: The FC metafunction,
also in figure 4, takes a flat contract and produces
the source code of a function which, when applied to a
value, produces true or false indicating whether the value
passes the contract. The additional complexity over the
similar rules of sections 2 and 3 handles the addition of
flat contracts containing recursive contracts, disjunctive and
conjunctive contracts, and pair contracts. In particular, to
check disjunctive contracts, we must test if the left disjunct
passes the contract, and conditionalize on the result,
whereas our earlier reduction rules for flat contracts
simply fail for contracts that don't pass. As an example,
the check expression $\mathsf{mon}^{f,g}_{h}(\mathsf{flat}(\mathsf{nat?}), V)$ reduces to
$\mathsf{if}\ (\mathsf{nat?}\ V)\ V\ \mathsf{blame}^{f}_{h}$, but using this reduction to check
$\mathsf{mon}(\mathsf{flat}(\mathsf{nat?}) \vee \mathsf{flat}(\mathsf{bool?}), \mathsf{tt})$ would cause a blame
error when checking the left disjunct, which is obviously not
the intended result. Instead, the rules for FC generate the
check if (nat? tt) tt (mon(flat(bool?), tt)), which succeeds
as desired.
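
The FC metafunction can be sketched as a small compiler from flat contracts to ordinary predicates. The encoding below is an assumption made for illustration: contracts are tagged tuples, pairs are Python 2-tuples, the empty list is `()`, and recursive contracts are handled by binding the contract variable to the predicate being defined. Unlike FC, which produces source code, this sketch produces Python closures directly.

```python
def fc(c, env=None):
    """Compile a flat contract (tagged tuple) into a Python predicate."""
    env = env or {}
    tag = c[0]
    if tag == "flat":                 # ("flat", predicate)
        return c[1]
    if tag == "and":                  # ("and", C1, C2)
        p, q = fc(c[1], env), fc(c[2], env)
        return lambda y: p(y) and q(y)
    if tag == "or":                   # ("or", C1, C2): try left, then right
        p, q = fc(c[1], env), fc(c[2], env)
        return lambda y: p(y) or q(y)
    if tag == "pair":                 # ("pair", C1, C2)
        p, q = fc(c[1], env), fc(c[2], env)
        return lambda y: (isinstance(y, tuple) and len(y) == 2
                          and p(y[0]) and q(y[1]))
    if tag == "rec":                  # ("rec", "X", C)
        def self_pred(y):
            return pred(y)            # looked up at call time, after `pred` is bound
        pred = fc(c[2], {**env, c[1]: self_pred})
        return pred
    if tag == "var":                  # ("var", "X")
        return env[c[1]]
    raise ValueError(f"not a flat contract: {c!r}")


# Hypothetical base predicates for the example.
def is_nat(v):
    return isinstance(v, int) and not isinstance(v, bool) and v >= 0

def is_bool(v):
    return isinstance(v, bool)

def is_empty(v):
    return v == ()

nat_or_bool = fc(("or", ("flat", is_nat), ("flat", is_bool)))
list_of_nat = fc(("rec", "X", ("or", ("flat", is_empty),
                               ("pair", ("flat", is_nat), ("var", "X")))))
```

Here `nat_or_bool(True)` succeeds because the failure of the left disjunct only selects the right one, which mirrors the conditional generated by FC rather than the blame-on-failure behavior of a monitor.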

Higher-order contracts: The next set of reduction rules, 
presented in figure 5, defines the behavior of higher-order
contract checks; we assume for these rules that the checked 
contract is not flat. 

In the first rule, we again use the $\eta$-expansion technique
pioneered by Findler and Felleisen [17] to decompose
a higher-order contract into subcomponents. This rule
only applies if the contracted value V is indeed a function,
as indicated by proc?. (In SCPCF, this side-condition
is avoided thanks to the type system.) Otherwise, the second
rule blames the positive party of a function contract when
the supplied value is not a function.

The remaining rules handle higher-order contracts that 
are not immediately function contracts, such as pairs of 
function contracts. The first two are for pair contracts. If 
the value is determined to be a pair by cons?, then the 
components are extracted using car and cdr and checked 
against the relevant portions of the contract. If the value is 
not a pair, then the program reduces to blame, analogous to 
the case for function contracts. 

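The last set of rules decompose combinations of higher-order
contracts. Recursive contracts are unrolled. (Productivity
ensures that contracts do not unroll forever.) Conjunctions
are split into their components, with the left checked
before the right. For higher-order disjunctions, we rely on
the invariant that only the right disjunct is higher-order and
use FC for the check of the left. When possible, we omit the
generation of this check by using the proof system as
described above (in the final two rules).

Function contract reduction: $\mathsf{mon}(C, V) \longmapsto E'$

$$
\begin{array}{rcll}
\mathsf{mon}^{f,g}_{h}(C \mapsto \lambda X.D, V) & \longmapsto &
  \lambda X.\mathsf{mon}^{f,g}_{h}(D, (V\ \mathsf{mon}^{g,f}_{h}(C, X))) &
  \text{if } \delta(\mathsf{proc?}, V) \ni \mathsf{tt} \\
\mathsf{mon}^{f,g}_{h}(C \mapsto \lambda X.D, V) & \longmapsto & \mathsf{blame}^{f}_{h} &
  \text{if } \delta(\mathsf{proc?}, V) \ni \mathsf{ff}
\end{array}
$$

Other higher-order contract reductions

$$
\begin{array}{rcll}
\mathsf{mon}^{f,g}_{h}(\langle C, D \rangle, V) & \longmapsto &
  (\mathsf{cons}\ \mathsf{mon}(C, \mathsf{car}\ V')\ \mathsf{mon}(D, \mathsf{cdr}\ V')) &
  \text{if } \delta(\mathsf{cons?}, V) \ni \mathsf{tt} \text{ and } V' = V \cdot \mathsf{flat}(\mathsf{cons?}) \\
\mathsf{mon}^{f,g}_{h}(\langle C, D \rangle, V) & \longmapsto & \mathsf{blame}^{f}_{h} &
  \text{if } \delta(\mathsf{cons?}, V) \ni \mathsf{ff} \\
\mathsf{mon}(\mu X.C, V) & \longmapsto & \mathsf{mon}([\mu X.C/X]C, V) & \\
\mathsf{mon}(C \wedge D, V) & \longmapsto & \mathsf{mon}(D, \mathsf{mon}(C, V)) & \\
\mathsf{mon}(C \vee D, V) & \longmapsto & \mathsf{if}\ (\mathrm{FC}(C)\ V)\ (V \cdot C)\ \mathsf{mon}(D, V) &
  \text{if } \vdash V : C\ ? \\
\mathsf{mon}(C \vee D, V) & \longmapsto & V & \text{if } \vdash V : C\ \checkmark \\
\mathsf{mon}(C \vee D, V) & \longmapsto & \mathsf{mon}(D, V) & \text{if } \vdash V : C\ \text{✗}
\end{array}
$$

Figure 5. Higher-order contract reduction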
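
The function-contract rule can be illustrated with an ordinary dynamic contract wrapper. The sketch below is a concrete (non-symbolic) approximation under stated assumptions: contracts are Python closures of the hypothetical form `check(value, pos, neg, src)`, blame is an exception, and the range contract may depend on the argument. It reproduces the `dbl` example from the implementation section, in which the top level is blamed.

```python
class Blame(Exception):
    def __init__(self, party, source):
        super().__init__(f"{party} broke its contract with {source}")
        self.party, self.source = party, source

def flat(pred):
    """Flat contract: check immediately, blaming the positive party on failure."""
    def check(v, pos, neg, src):
        if pred(v):
            return v
        raise Blame(pos, src)
    return check

def func(dom, rng_of):
    """Function contract; rng_of maps the argument to the range contract."""
    def check(f, pos, neg, src):
        if not callable(f):
            raise Blame(pos, src)            # a non-function blames the positive party
        def wrapper(x):
            arg = dom(x, neg, pos, src)      # parties swap on the domain
            return rng_of(x)(f(arg), pos, neg, src)
        return wrapper                       # the check is delayed until application
    return check

# The dbl example: ((even? -> even?) -> (even? -> even?)).
even = flat(lambda n: isinstance(n, int) and n % 2 == 0)
even_to_even = func(even, lambda _x: even)
dbl_contract = func(even_to_even, lambda _f: even_to_even)
dbl = lambda f: lambda x: f(f(x))

# dbl is provided by module "dbl" and used by the top level.
checked_dbl = dbl_contract(dbl, "dbl", "top-level", "dbl")
```

Calling `checked_dbl(lambda x: 7)(4)` raises `Blame` with party `"top-level"`, since the supplied function returns the odd number 7 and the argument position of `dbl` is the client's responsibility.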

4.2.5 Applying abstract values 

Again, application of abstract values poses a challenge, just
as it did in section 3. We now must explore more possible
behaviors of abstract operators, and we no longer have types 
to guide us. Fortunately, abstract values also give us the tools 
to express the needed computation. 



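Applying abstract values: $E \longmapsto E'$ where $V = \bullet/\mathcal{C}$

$$
\begin{array}{rcll}
(V\ V') & \longmapsto & \bullet/\{[V'/X]D \mid (C \mapsto \lambda X.D) \in \mathcal{C}\} &
  \text{if } \delta(\mathsf{proc?}, V) \ni \mathsf{tt} \\
(V\ V') & \longmapsto & \mathsf{havoc}\ V' & \text{if } \delta(\mathsf{proc?}, V) \ni \mathsf{tt}
\end{array}
$$

$$
\begin{array}{rcl}
\mathsf{havoc} & = & \mu y.(\lambda x.\mathrm{AMB}(\{y\ (x\ \bullet),\ y\ (\mathsf{car}\ x),\ y\ (\mathsf{cdr}\ x)\})) \\
\mathrm{AMB}(\{E\}) & = & E \\
\mathrm{AMB}(\{E, E_1, \ldots\}) & = & \mathsf{if}\ \bullet\ E\ \mathrm{AMB}(\{E_1, \ldots\})
\end{array}
$$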



The behavior of abstract values, which are created by
references to opaque modules, is handled in much the same
way as in SCPCF. When an abstract function is applied, there
are again two possible scenarios: (1) the abstract function
returns an abstract value or (2) the argument escapes into
an unknown context that causes the value to be blamed. We
again make use of a havoc function for discovering if the
possibility of blame exists. In contrast to the typed setting of
SCPCF, we need only one such value. The demonic context
is a universal context that will produce blame if there exists
a context that produces blame originating from the value. If
the universal demonic context cannot produce blame, only
the range value is produced.

The havoc function is implemented as a recursive func- 
tion that makes a non-deterministic choice as to how to treat 
its argument — it either applies the argument to the least- 
specific value, •, or selects one component of it, and then 
recurs on the result of its choice. This subjects the input 
value to all possible behavior that a context might have. Note 
that the demonic context might itself be blamed; we implic- 
itly label the expressions in the demonic context with a dis- 
tinguished label and disregard these spurious errors in the 
proof of soundness. We use the AMB metafunction to imple- 
ment the non-determinism of havoc; AMB uses an if test of 
an opaque value, which reduces to both branches. 
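
The AMB desugaring is simple enough to sketch directly. In this illustration, expressions are plain Python values, an `if` is a tagged tuple, and the string `"•"` stands for the opaque test value; these encodings are assumptions of the sketch. The `outcomes` helper models the fact that a conditional on an opaque value reduces to both branches.

```python
def amb(exprs):
    """AMB of one expression is that expression; otherwise nest opaque conditionals."""
    if not exprs:
        raise ValueError("AMB needs at least one expression")
    first, *rest = exprs
    if not rest:
        return first
    return ("if", "•", first, amb(rest))

def outcomes(expr):
    """All expressions an AMB term may reduce to."""
    if isinstance(expr, tuple) and len(expr) == 4 and expr[0] == "if" and expr[1] == "•":
        return outcomes(expr[2]) + outcomes(expr[3])   # both branches are taken
    return [expr]

# The three choices made in the body of havoc, written as strings for brevity.
havoc_body = amb(["(y (x •))", "(y (car x))", "(y (cdr x))"])
```

Evaluating `outcomes(havoc_body)` recovers the three alternatives, so every way the demonic context can use its argument is explored.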

4.3 Proof system 

Compared to the very simple proof system of section 3.4, the
system for proving or refuting whether a given value satisfies 
a contract in Core Racket is more sophisticated, although the 
general principles remain the same. 

In particular, we rely on three different kinds of judgments
that relate values and contracts: proves, refutes, and
neither. The first, $\vdash V : C\ \checkmark$, includes the original judgment
that a value proves a contract if it remembers that contract.
Additionally, we add judgments for reasoning about type
predicates in the language. For example, if a value is known
to satisfy a particular base predicate, written $\vdash V : o?\ \checkmark$,
then the value satisfies the contract flat(o?). This relies on
the relation between values and predicates used above in the
definition of $\delta$, which is defined in a straightforward way.

The refutes relation is more interesting and relies on addi- 
tional semantic knowledge, such as the disjointness of data 
types. For instance, a value that remembers it is a procedure
refutes all pair contracts and the pair? predicate contract.
Other refutes judgments are straightforward based on
structural decomposition of contracts and values. 

The complete definition of these relations is given in
appendix A.4. Our implementation, described in section 6,
incorporates a richer set of rules for improved reasoning. The
implementation is naive but effective for basic semantic
reasoning; however, it does essentially no sophisticated
reasoning about base type domains such as numbers, strings, or
lists. The tool could immediately benefit from leveraging an
external solver to decide properties of concrete values.

4.4 Improving precision via non-determinism 

Since our reduction rules, and in particular the $\delta$ relation,
make use of the remembered contracts on values, making 
these contracts as specific as possible improves precision of 
the results. 



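Improving precision via non-determinism: $V \longmapsto V'$

$$
\begin{array}{rcll}
\bullet/\mathcal{C} \cup \{C_1 \vee C_2\} & \longmapsto & \bullet/\mathcal{C} \cup \{C_i\} & i \in \{1, 2\} \\
\bullet/\mathcal{C} \cup \{\mu X.C\} & \longmapsto & \bullet/\mathcal{C} \cup \{[\mu X.C/X]C\} &
\end{array}
$$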

The two rules above increase the specificity of abstract values.
The first splits abstract values known to satisfy a disjunctive
contract. For example,
$\bullet/\{\mathsf{flat}(\mathsf{nat?}) \vee \mathsf{flat}(\mathsf{bool?})\} \longmapsto \bullet/\mathsf{flat}(\mathsf{nat?})$
and $\bullet/\mathsf{flat}(\mathsf{bool?})$. This converts the imprecision
of the value into non-determinism in the reduction relation,
and makes subsequent uses of $\delta$ more precise on the two
resulting values. Similarly, we unfold recursive contracts in
abstract values; this exposes further disjunctions to split, as
with a contract for lists.
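
The two refinement rules can be sketched as a function that repeatedly unrolls and splits the remembered contracts of an abstract value until no top-level disjunction or recursive contract remains. The tuple encoding of contracts and the names `subst` and `refine` are assumptions of this sketch; termination relies on the productivity requirement stated earlier.

```python
def subst(c, var, repl):
    """Substitute contract `repl` for contract variable `var` in contract `c`."""
    tag = c[0]
    if tag == "var":
        return repl if c[1] == var else c
    if tag == "flat":
        return c
    if tag == "rec":                      # ("rec", X, body); respect shadowing
        return c if c[1] == var else ("rec", c[1], subst(c[2], var, repl))
    # binary constructors: "and", "or", "pair", "func"
    return (tag, subst(c[1], var, repl), subst(c[2], var, repl))

def refine(contracts):
    """All fully refined contract sets for an abstract value with these contracts."""
    for c in contracts:
        rest = contracts - {c}
        if c[0] == "or":                  # split: one result per disjunct
            return refine(rest | {c[1]}) + refine(rest | {c[2]})
        if c[0] == "rec":                 # unroll one level
            return refine(rest | {subst(c[2], c[1], c)})
    return [contracts]

# list/c as a recursive contract: empty, or a pair of a nat and a list.
list_c = ("rec", "X", ("or", ("flat", "empty?"),
                       ("pair", ("flat", "nat?"), ("var", "X"))))
cases = refine(frozenset({list_c}))
```

For `list_c`, `cases` contains exactly two contract sets: one remembering `empty?` and one remembering a pair whose tail is again `list_c`.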

As an example of the effectiveness of this simple ap- 
proach, consider the list length function: 

(module length
  (provide [len (list/c -> nat?)])
  (define len
    (λ (l)
      (if (empty? l) 0 (+ 1 (len (cdr l)))))))

When applied to the symbolic value

$$\bullet \cdot \mu X.(\mathsf{flat}(\mathsf{empty?}) \vee \langle \mathsf{flat}(\mathsf{nat?}), X \rangle)$$

which is the definition of list/c, we immediately unroll and 
split the abstract value, meaning that we evaluate the body of 
len in exactly the two cases it is designed to handle, with a 
precise result for each. Without this splitting, the test would 
return both tt and ff, and the semantics would attempt to 
take the cdr of the empty list, even though the function will 
never fail on concrete inputs. This provides some of the 
benefits of occurrence typing [41] simply by exploiting the 
non-determinism of the reduction semantics. 

4.5 Evaluation and Soundness 

We now define evaluation of entire modular programs, and 
prove soundness for our abstract reduction semantics. One 
complication remains. In any program with opaque mod- 
ules, any module might be referenced, and then treated ar- 
bitrarily, by one of the opaque modules. While this does not 
affect the value that the main expression might reduce to, it 
does create the possibility of blame that has not been pre- 
viously predicted. We therefore place each concrete mod- 
ule into the previously-defined demonic context and non- 
deterministically choose one of these expressions to run 
prior to running the main module of the program. 
The evaluation function is defined as:

$$\mathsf{eval}(M\,E) = \{A \mid M \vdash E'; E \longmapsto^{*} A\},$$

where $E' = \mathrm{AMB}(\{\mathsf{tt}, \mathsf{havoc}\ f\})$, $(\mathsf{module}\ f\ C\ V) \in M$.

Soundness, as in section 3.3, relies on the definition of
approximation between terms, and its straightforward extension
to modules and programs. (The details of the approximation
relation for Symbolic Core Racket are given in
appendix B.)

Theorem 2 (Soundness of Symbolic Core Racket). 

If $P \sqsubseteq Q$ where $Q = M\,E$ and $A \in \mathsf{eval}(P)$, then there
exists some $A' \in \mathsf{eval}(Q)$ where $A \sqsubseteq A'$.

This soundness result, proved in appendix B, implies that
if a program with opaque modules does not produce blame,
then the known modules cannot be blamed, regardless of the
choice of implementation for the opaque modules.

Corollary 1. If $f$ is the name of a concrete module in $P$,
and $P \not\longmapsto^{*} \mathcal{E}[\mathsf{blame}^{f}_{g}]$, then no instantiation of the opaque
modules in $P$ can cause $f$ to be blamed.



5. Convergence and decidability 

At this point, we have constructed an abstract reduction
semantics that gives meaning to programs with opaque
components. The semantics is a sound abstraction of all possible
instantiations of the omitted components; thus it can be
used to verify that modular programs satisfy their specifications.
However, in order to automatically verify programs, the se- 
mantics must converge for the particular program being an- 
alyzed. 

We now describe how to refactor the semantics in such 
a way that we can accelerate and — if desired — guarantee 
convergence by introducing further orthogonal approximation
into the semantics. This is accomplished in a number of ways:

• basic operations may widen when applied to concrete 
values, 

• environment structure may be bounded, and 

• control structure may be bounded. 

In order to guarantee convergence for all possible pro- 
grams, all three of these forms of approximation must be em- 
ployed and are sufficient to guarantee decidability of the se- 
mantics. However, our experience suggests that such strong 
convergence guarantees may be unnecessary in practice. For 
example, we have found that a simple widening of concrete 
recursive function applications to their contract when ap- 
plied to abstract values works well for ensuring convergence 
of tail-recursive programs broken into small modules. By 
adding a limited form of control structure approximation, 
we are able to automatically verify non-tail-recursive func- 
tions. Taken together, these forms of approximation do not 
guarantee convergence in general, yet they do induce
convergence for all of the examples we have considered (see
§6.2), and with fewer spurious results compared to more
traditional forms of abstraction such as 0CFA and pushdown
flow analysis. But rather than advocate a particular
approximation strategy, we now describe how to refactor the
semantics so that all these choices may be expressed. Our
implementation (§6) then makes it easy to explore any of them.



5.1 Widening values 

Widening the results of basic operations, i.e., those
interpreted by $\delta$, can cut down the set of base values, and if
necessary can ensure finiteness of base values. Thus, we replace
$\delta$ with $\delta'$:

$$
\frac{\delta(O, \vec{V}) \ni V}{\delta'(O, \vec{V}) \ni \mathsf{widen}(V)}
$$

where widen represents an arbitrary choice of a metafunction
for mapping a value to its approximation. To avoid
approximation, it is interpreted as the identity function. To
ensure finite base values, it must map to a finite range; a
simple example is $\mathsf{widen}(V) = \bullet$ for all $V$. An example
of a more refined interpretation is $\mathsf{widen}(n) = \mathsf{flat}(\mathsf{nat?})$,
$\mathsf{widen}(\mathsf{cons}\ V\ U) = \mathsf{flat}(\mathsf{cons?})$, etc. For soundness, we
require that $\mathsf{widen}(V) = U$ implies $V \sqsubseteq U$.
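
The role of widen as a pluggable parameter can be sketched as follows. The concrete `delta`, the `Abs` record, and the three widening functions are hypothetical stand-ins that mirror the three interpretations named above (identity, everything to the opaque value, and widening by base type).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Abs:
    """An opaque value, optionally remembering one flat contract (by name)."""
    contract: Optional[str] = None

def delta(op, *args):
    """A tiny concrete delta, as a relation: returns the set of possible answers."""
    if op == "add1":
        return {args[0] + 1}
    if op == "+":
        return {args[0] + args[1]}
    if op == "cons":
        return {(args[0], args[1])}
    raise ValueError(f"unknown operation: {op}")

def identity(v):            # exact: no approximation
    return v

def widen_top(v):           # coarsest: every result becomes the opaque value
    return Abs()

def widen_by_type(v):       # more refined: keep only the base type of the result
    if isinstance(v, int) and not isinstance(v, bool) and v >= 0:
        return Abs("flat(nat?)")
    if isinstance(v, tuple):
        return Abs("flat(cons?)")
    return Abs()

def delta_prime(widen, op, *args):
    """delta' relates the operation to widen(V) for every V related by delta."""
    return {widen(v) for v in delta(op, *args)}
```

With `widen_by_type`, applying `add1` to any natural number yields the same single abstract result, which is what bounds the set of base values explored.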

5.2 Bounding environment structure 

The lexical environment of a program represents a source of
unbounded structure. To enable approximation of the lexical
environment, we first refactor the semantics as a calculus of
explicit substitutions [11] with a global store. Substitutions
are modeled by finite maps from variables to addresses and
the store maps addresses to sets of values, which are now
represented as closures:

$$
\begin{array}{rcl}
\rho & ::= & \emptyset \mid \rho[X \mapsto a] \\
\sigma & ::= & \emptyset \mid \sigma[a \mapsto \{V, \ldots\}]
\end{array}
$$



Reductions that bind variables, such as function application,
must allocate and extend the environment. For example,

$$((\lambda X.E)\ V)^{\ell} \longmapsto [V/X]E$$

becomes an analogous reduction relation on closures and
stores:

$$
(((\lambda X.E), \rho)\ V)^{\ell}, \sigma \longmapsto (E, \rho[X \mapsto a]),\ \sigma \sqcup [a \mapsto V]
\quad \text{where } a = \mathsf{alloc}(\sigma, X)
$$

and the interpretation of $\sigma \sqcup [a \mapsto V]$ is $\sigma'$ s.t. $\sigma'(b) = \sigma(b)$
if $a \neq b$ and $\sigma'(a) = \sigma(a) \cup \{V\}$.

Since reduction operates over closures, there is an
additional case needed to handle variable references:

$$(X, \rho), \sigma \longmapsto V, \sigma \quad \text{if } V \in \sigma(\rho(X))$$



The alloc metafunction provides a point of control which
regulates the approximation of environment structure. To
ensure finite environment approximation, the metafunction
must map to a finite set of addresses (for a fixed program).
A simple finite abstraction is $\mathsf{alloc}(\sigma, X) = 0$ for all $\sigma$ and
$X$. This abstraction maps all bindings to a single location,
thus conflating all bindings in a program. Although highly
imprecise, this is a sound approximation, and in fact any
instantiation of alloc is sound [42]. A more refined abstraction
is $\mathsf{alloc}(\sigma, X) = X$, which provides a finite abstraction
of environment structure similar to 0CFA in which multiple
bindings of the same variable are conflated. To avoid
approximation, $\mathsf{alloc}(\sigma, X)$ should choose a fresh address not in
$\sigma$. Consequently, the store maps all addresses to singleton
sets of values and the environment-based reduction semantics
corresponds precisely with the original.
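
The effect of the three allocation strategies on the store can be seen in a small sketch. The store is a dictionary from addresses to sets of values, `join` implements the store update described above, and the helper `bind_all` (a hypothetical name) binds a sequence of variables in order.

```python
import itertools

def join(store, addr, value):
    """Store join: add value at addr, leaving every other address unchanged."""
    new = dict(store)
    new[addr] = store.get(addr, frozenset()) | {value}
    return new

def alloc_single(store, var):
    return 0                      # all bindings share one location

def alloc_per_variable(store, var):
    return var                    # 0CFA-like: one location per variable name

def alloc_fresh(store, var):
    # exact: the first natural number not already used as an address
    return next(a for a in itertools.count() if a not in store)

def bind_all(alloc, bindings):
    """Bind each (variable, value) pair in order; return the final env and store."""
    env, store = {}, {}
    for var, val in bindings:
        a = alloc(store, var)
        env = {**env, var: a}
        store = join(store, a, val)
    return env, store

bindings = [("x", 1), ("y", 2), ("x", 3)]
```

Under `alloc_fresh` every address holds a singleton set, so a variable reference is deterministic; under the other two strategies a reference to `x` may non-deterministically produce any value joined at its address.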

5.3 Bounding control structure 

The remaining source of unbounded structure stems from the 
control component of a program. Bounding the environment 
structure was achieved by (1) making substitutions explicit 
as environments and (2) threading environments through a 
store which could be bounded. An analogous approach is 
taken for control: (1) evaluation contexts are explicated as 
continuations and (2) continuations are threaded through the 
store. 

The resulting semantics is an abstract machine that oper- 
ates over a triple comprised of a closure, a store, and a con- 
tinuation. Transitions take three forms: decomposition steps, 
which search for the next redex and push continuations if 
needed; plug steps which return a value to a context and pop 
continuations if needed; and contraction steps, which
implement the notion of reduction.

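We write continuations $\kappa$ as single evaluation context
frames with embedded addresses representing (a pointer to)
the surrounding context. So, for example, a frame annotated
with address $a$ represents that frame inside a surrounding
context $\mathcal{E}$, where $a$ points to a continuation representing
$\mathcal{E}$. A simple decompose case is:

$$
((E\ E', \rho), \sigma, \kappa) \longmapsto ((E, \rho),\ \sigma \sqcup [a \mapsto \kappa],\ ([\,]\ (E', \rho))^{a})
\quad \text{where } a = \mathsf{alloc}(\sigma, \kappa)
$$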

Notice that this transition searches for the next redex in
the left side of an application and allocates a pointer to
the given context in order to push on the argument to be
evaluated later. Similar to variable binding, continuations
are allocated using alloc and joined in the store, allowing
for multiple continuations to reside in a single location. The
corresponding plug rule pops the current continuation frame
and non-deterministically chooses a continuation:

$$
(V, \sigma, ([\,]\ (E', \rho))^{a}) \longmapsto ((V\ (E', \rho)), \sigma, \kappa)
\quad \text{where } \kappa \in \sigma(a)
$$

The contraction rule simply applies the reduction relation on
explicit substitutions:

$$
((E, \rho), \sigma, \kappa) \longmapsto ((E', \rho'), \sigma', \kappa)
\quad \text{if } ((E, \rho), \sigma) \longmapsto ((E', \rho'), \sigma')
$$

To ensure a finite approximation of control, alloc must
map to a finite set of addresses. A simple finite abstraction is
$\mathsf{alloc}(\sigma, \kappa) = 0$. A more refined finite abstraction of control
is to use a frame abstraction, in which the address is determined
by the frame alone, e.g. $\mathsf{alloc}(\sigma, ([\,]\ (E', \rho))^{a}) = ([\,]\ E')$, and
likewise for other continuation forms. To avoid approximation,
$\mathsf{alloc}(\sigma, \kappa)$ should produce an address not in $\sigma$.
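
Threading continuations through the store can be sketched in the same style as variable bindings. In this illustration a continuation is either the string `"halt"` or a pair of a frame and an address; `push` corresponds to a decomposition step and `pop` to a plug step. All names are hypothetical and the frames are plain strings.

```python
import itertools

def join(store, addr, value):
    """Store join: add value at addr, leaving every other address unchanged."""
    new = dict(store)
    new[addr] = store.get(addr, frozenset()) | {value}
    return new

def alloc_single(store, kont):
    return 0                      # every continuation lives at one address

def alloc_fresh(store, kont):
    return next(a for a in itertools.count() if a not in store)

def push(store, kont, frame, alloc):
    """Save the current continuation in the store and return the new frame."""
    a = alloc(store, kont)
    return join(store, a, kont), (frame, a)

def pop(store, kont):
    """Return every continuation the frame's address may point to."""
    _frame, a = kont
    return set(store[a])

def run(alloc):
    """Push a frame from two different call sites, then return from the first."""
    store = {}
    store, k1 = push(store, "halt", "arg-frame-1", alloc)
    store, k2 = push(store, k1, "arg-frame-2", alloc)
    return pop(store, k1)
```

With `alloc_fresh`, returning from the first frame yields only `"halt"`; with `alloc_single`, the two saved continuations are joined at one address, so the return non-deterministically flows to both, which is the imprecision a finite control abstraction trades for termination.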

We have now restructured our semantics as a machine
model with three distinct points of control over approximation:
basic operations, environments, and control; the full
definition of the machine is given in appendix A.5. We now
establish the correspondence between the previous reduction
semantics and the machine model when no approximation
occurs. Let $\longmapsto_{CESK}$ denote the machine transition relation
under the exact interpretations of widen and alloc. Let $\mathcal{U}$ be
the straightforward recursive "unload" function that maps a
closure and store to the closed term it represents.

Lemma 1 (Correspondence). If $P \longmapsto^{*} Q$, then there exists
$\varsigma$ such that $(P, \emptyset, \emptyset) \longmapsto^{*}_{CESK} \varsigma$ and $\mathcal{U}(\varsigma) = Q$.

We now relate any approximating variant of the machine
to its exact counterpart. Let $\longmapsto_{\widehat{CESK}}$ denote the machine
transition under any sound interpretation of widen. We define
an abstraction map as a structural abstraction of the
state-space of the exact machine to its approximate
counterpart. The key case is on stores:

$$
\alpha(\sigma) = \lambda \hat{a}. \bigsqcup_{\alpha(a) = \hat{a}} \{\alpha(\sigma(a))\}
$$

The $\sqsubseteq$ relation is lifted to machine states as the natural
point-wise, element-wise, component-wise and member-wise lifting.

Theorem 3 (Soundness). If $\varsigma \longmapsto_{CESK} \varsigma'$ and $\alpha(\varsigma) \sqsubseteq \hat{\varsigma}$,
then there exists $\hat{\varsigma}'$ such that $\hat{\varsigma} \longmapsto_{\widehat{CESK}} \hat{\varsigma}'$ and $\alpha(\varsigma') \sqsubseteq \hat{\varsigma}'$.

We have now established that any instantiation of the machine
is a sound approximation to the exact machine, which in turn
corresponds to the original reduction semantics. Furthermore,
we can prove decidability of the semantics for finite
instantiations of widen and alloc:

Theorem 4 (Decidability). If widen and alloc have finite
range for a program $P$, then $(P, \emptyset, \emptyset) \longmapsto^{*}_{\widehat{CESK}} \hat{\varsigma}$ is
decidable for any $\hat{\varsigma}$.

The proofs of these theorems closely follow those given
by Van Horn and Might [42] and are deferred to appendix B.

6. Implementation 

To validate our approach, we have implemented a prototype
interactive program verification environment, as seen in
figure 6. We can take the example from section 2, define the
relevant modules, and explore the behavior of different choices
for the main expression.

Programs are written with the #lang var <options> 
header, where <options> range over a visualization mode: 
trace, step, or eval; a model mode: term or machine; 
and an approximation mode: approx, exact, or user. 
Following the header, programs are written in a subset of 
Racket, consisting of a series of module definitions and a 
top-level expression. 

The visualization mode controls how the state space is
explored. The choices are running the program to completion
with a read-eval-print loop (eval), visualizing a directed
graph of the state space labeled by transitions (trace), or
interactive step-by-step exploration (step). The model mode
selects whether to use the term or the machine as the underlying
model of computation.

dbl.rkt - DrRacket

#lang var eval term exact
(module dbl racket
  (require 'even?)
  (define dbl (λ (f) (λ (x) (f (f x)))))
  (provide/contract
   [dbl ((even? . -> . even?) . -> . (even? . -> . even?))]))

(module fun racket
  (require 'even?)
  (provide/contract [fun (-> even? even?)]))

Welcome to DrRacket, version 5.2.0.1--2011-10-15(7270c27/a) [3m].
Language: var [custom]; memory limit: 123 MB.
'(• (pred even?))
> ((dbl fun) 4)
'(• (pred even?))
> ((dbl (λ (x) 7)) 4)
'(blame † dbl (λ (x) 7) (pred even?) 7)

Figure 6. Interactive program verification environment

Finally, the approximation mode selects what, if any, ap- 
proximation should be used. The exact mode uses no ap- 
proximation; allocation always returns fresh addresses and 
no widening is used for base values. The approx mode uses 
a default mode of approximation that has proved useful in 
verifying programs (discussed below). The user mode al- 
lows the user to provide their own custom metafunctions for 
address allocation and widening. 

6.1 Implementation extensions 

Our prototype includes significant extensions to the system 
as described above. 

First, we make numerous extensions in order to verify ex- 
isting Racket programs. For example, modules are extended 
to include multiple definitions and functions may accept zero 
or more arguments. The latter complicates the reduction re- 
lation as new possibilities arise for errors due to arity mis- 
matches. Second, we add additional base values and opera- 
tions to the model to support more realistic programs. Third, 
we make the implementation of contract checking and
reduction more sophisticated, to reduce complexity in simple
cases, improving running time and simplifying visualization.
Fourth, we implement several techniques to reduce
the size of the state space explored in practice, including
abstract garbage collection [32]. Abstract GC enables naive
allocation strategies to perform with high precision.
Additionally, we widen contracted recursive functions to their
contracts on recursive calls; this implements a form of
induction that is highly effective at increasing convergence.
Fifth, we add simpler rules to model non-recursive functions
and non-dependent contracts. This brings the model closer to
programmers' expectations of the semantics of the language,
and simplifies visualizations. Sixth, we include several more
contract combinators, such as and/c for contract conjunction;
atom/c for expressing equality with atomic values;
one-of/c for finite enumerations; struct/c for structures;
and listof and non-empty-listof for lists of values.

Finally, we provide richer blame information when programs
go wrong, as can be seen in the screenshot in figure 6.
Our system reports the full complement of information available
in Racket's production contract library, which reports
the failure of dbl as:

> ((dbl (λ (x) 7)) 4)
top-level broke (even? -> even?) ->
                (even? -> even?)
on dbl; expected <even?>, given: 7
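
To make the blame report above concrete, the following is a minimal Python sketch of higher-order contract monitoring with blame. It is not the prototype's implementation: the names Blame, flat, and arrow, and this particular definition of dbl (apply the argument twice), are assumptions made for illustration only.

```python
class Blame(Exception):
    """Raised when a party breaks a contract (hypothetical helper)."""
    def __init__(self, party, expected, given):
        super().__init__(f"{party} broke the contract; "
                         f"expected <{expected}>, given: {given!r}")
        self.party, self.expected, self.given = party, expected, given

def flat(pred, name):
    """Flat contract: check a first-order predicate immediately."""
    def check(value, pos, neg):
        if not pred(value):
            raise Blame(pos, name, value)
        return value
    return check

def arrow(dom, rng):
    """Function contract: wrap the function; blame parties swap on arguments."""
    def check(fn, pos, neg):
        def wrapped(x):
            return rng(fn(dom(x, neg, pos)), pos, neg)
        return wrapped
    return check

is_even = flat(lambda n: isinstance(n, int) and n % 2 == 0, "even?")

# dbl : (even? -> even?) -> (even? -> even?), assumed to apply f twice.
dbl_contract = arrow(arrow(is_even, is_even), arrow(is_even, is_even))
dbl = dbl_contract(lambda f: lambda x: f(f(x)), "dbl", "top-level")
```

Calling dbl(lambda x: 7)(4) in this sketch raises Blame against "top-level", because the argument function supplied by the top level returns the odd number 7.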



Our prototype is available at github.com/samth/var.



6.2 Verified examples 

We have verified a number of example programs, which fall 
into three categories: 

• small programs with rich contracts such as the example 
of insertion-sort from the introduction, where veri- 
fying contract correctness is close to full functional veri- 
fication; 

• tricky-to-verify programs with simple contracts such as 
the tautology checker described by Wright and Cartwright, 
where our tool proves not only that the function satisfies 
its boolean? -> boolean? contract, but also that it has 
no internal run-time type errors; and 

• several larger graphical, interactive video games developed
according to the program-by-design [16] approach
that stresses data- and contract-driven program design. 

In the third category, we were able to automatically ver- 
ify contract correctness for non-trivial existing programs, 
including two, Snake and Tetris, developed for our under- 
graduate programming course. The programs make use of 
higher-order functions such as folds and maps and construct 
anonymous (upward) functions. Their contracts include fi- 
nite enumerations, structures, and recursive (ad-hoc) unions. 
Here is the key contract definition for the Snake game: 

(define-contract snake/c
  (struct/c snake
            (one-of/c 'up 'down 'left 'right)
            (non-empty-listof
             (struct/c posn nat? nat?))))

We are able to automatically verify the invariants stated here,
such as that every snake has at least one segment and that it
moves in one of four possible directions.
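
As a rough illustration of what snake/c demands of a value, the sketch below models the three combinators as run-time predicates in Python. This only checks values dynamically, whereas the tool described here verifies the contract statically; the helper names (one_of, struct_of, non_empty_listof) and the Posn and Snake records are assumptions, not part of the prototype.

```python
from collections import namedtuple

Posn = namedtuple("Posn", ["x", "y"])
Snake = namedtuple("Snake", ["direction", "segments"])

def is_nat(v):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(v, int) and not isinstance(v, bool) and v >= 0

def one_of(*atoms):
    """Finite enumeration, in the spirit of one-of/c."""
    return lambda v: v in atoms

def struct_of(cls, *field_contracts):
    """Structure contract, in the spirit of struct/c."""
    def check(v):
        return (isinstance(v, cls)
                and len(v) == len(field_contracts)
                and all(c(f) for c, f in zip(field_contracts, v)))
    return check

def non_empty_listof(contract):
    """Non-empty list whose elements all satisfy `contract`."""
    return lambda v: (isinstance(v, list) and len(v) > 0
                      and all(contract(e) for e in v))

snake_c = struct_of(
    Snake,
    one_of("up", "down", "left", "right"),
    non_empty_listof(struct_of(Posn, is_nat, is_nat)),
)
```

For example, snake_c(Snake("up", [Posn(0, 1)])) holds, while a snake with an empty segment list or a direction outside the four atoms is rejected.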

For these examples, we found that simply widening the 
results of recursive calls with abstract arguments is sufficient 
to ensure convergence in the semantics given an appropriate 
fine-grained module decomposition. All of our verified ex- 
amples are available with our implementation. 
7. Related work

The analysis and verification of programs and specifications 
has been a research topic for half a century; we survey only 
closely related work here. 

Symbolic execution Symbolic execution [24] is the idea of
running a program, but with abstract inputs. The technique 
can be used either for testing, to avoid the need to specify 
certain test data, or for verification and analysis. Over the 
past 35 years, it has been used for numerous testing and 
verification tasks. There has been a particular upsurge in 
interest in the last ten years HI [9), as high-performance 
SAT and SMT solvers have made it possible to eliminate 
infeasible paths by checking large sets of constraints. 

Most approaches to symbolic execution focus on abstract- 
ing first order data such as numbers, typically with con- 
straints such as inequalities on the values. In this paper, we 
present an approach to symbolic execution based on con- 
tracts as symbols, which scales straightforwardly to higher- 
order values. Despite this focus on higher-order values, the 
remembered contracts maintained by our system let us con- 
strain symbolic execution to feasible evaluations; using an 
external solver to decide relationships such as ⊢ V : C ✓ is
an important area of future work. 
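
The role of remembered contracts in pruning infeasible paths can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the paper's semantics: an opaque value carries only the names of base predicates known to hold, the predicates in DISJOINT are assumed mutually exclusive, and the names Opaque, prove, and branch are hypothetical.

```python
from dataclasses import dataclass

# Assumption of this sketch: these base predicates are mutually exclusive.
DISJOINT = frozenset({"nat?", "cons?", "proc?"})

@dataclass(frozen=True)
class Opaque:
    """An unknown value together with the predicates it is known to satisfy."""
    contracts: frozenset = frozenset()

def prove(v, pred):
    """Three-valued judgment: 'proved', 'refuted', or 'unknown'."""
    if pred in v.contracts:
        return "proved"
    if pred in DISJOINT and v.contracts & DISJOINT:
        return "refuted"
    return "unknown"

def branch(v, pred):
    """Feasible (test result, refined value) pairs when testing pred on v."""
    verdict = prove(v, pred)
    if verdict == "proved":
        return [(True, v)]
    if verdict == "refuted":
        return [(False, v)]
    # Unknown: explore both paths, remembering pred on the true path.
    return [(True, Opaque(v.contracts | {pred})), (False, v)]
```

Testing nat? on a fresh Opaque() yields two paths. On the true path the value remembers nat?, so a later nat? test has a single outcome and a later proc? test can only fail.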

Recently, under the heading of concolic execution [20, 36],
symbolic execution has been paired with test generation
to analyze software more effectively. We believe that our
system could serve as the framework for such a tool, by
nondeterministically reducing abstract values to concrete
instances.

The only work on higher-order symbolic execution that 
we are aware of is by Thiemann [40] on eliminating redundant
pattern matches. Thiemann considers only a very restricted
form of symbols: named functions partially applied
to arguments, constructors, and a top value. This approxi- 
mation is only sound for a purely functional language, and 
thus while we could incorporate it into our current sym- 
bolic model of Racket, further extensions to handle mutable 
state rule out the technique. It is unclear whether redundancy 
elimination would benefit from contracts as symbols.

Verification of first-order contracts Over the past ten 
years, there has been enormous success verifying modu- 
lar first-order programs, as demonstrated by tools such as 
the SLAM and Spec# projects [3, 6, 14]. These tools typically
operate by abstracting first-order programs in languages
such as C to simpler systems such as automata or
boolean programs, then model-checking the results for vio- 
lations of specified contracts. 

However, these approaches do not attempt to handle
the higher-order features of languages such as Racket,
Python, Scala, and Haskell. For instance, the boolean pro- 
gram abstraction employed by SLAM [2] is inherently first- 
order: variables can take only boolean values. Our system, 
in contrast, scales to higher-order language features. 



Despite the fundamental difference, there are important 
similarities between this work and ours. The systems all em- 
ploy nondeterminism extensively to reason about unknown 
behavior, and abstract the environment by allowing it to
take arbitrary actions, as in the havoc statement in Boogie [14],
which we generalize to the havoc function for placing
higher-order functions in an arbitrary context.
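
The idea behind a havoc function can be approximated by a bounded exploration in Python. The sketch below is a testing-style stand-in, not the demonic context of the paper: it replaces the unknown argument with a finite list of probe values and cuts off at a fixed depth, and the names Blame, havoc, probes, and depth are assumptions.

```python
class Blame(Exception):
    """Contract failure attributed to `party` (hypothetical helper)."""
    def __init__(self, party):
        super().__init__(party)
        self.party = party

def havoc(value, probes, depth=3):
    """Poke at `value` in every way this sketch knows: apply functions to
    each probe and take pairs apart, up to `depth` nested steps.
    Returns the set of parties blamed along the way."""
    blamed = set()
    if depth == 0:
        return blamed
    if callable(value):
        for p in probes:
            try:
                result = value(p)
            except Blame as b:
                blamed.add(b.party)
                continue
            blamed |= havoc(result, probes, depth - 1)
    elif isinstance(value, tuple):
        for part in value:
            blamed |= havoc(part, probes, depth - 1)
    return blamed

def safe_div(n):
    """Example module export that can be blamed when given 0."""
    if n == 0:
        raise Blame("safe_div")
    return lambda m: m // n
```

With probes [0, 1], havoc(safe_div, [0, 1]) reports {"safe_div"}. Unlike the real havoc context, an empty result here only means that no probe triggered blame, not that the module is safe.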

Additionally, we believe that the techniques used in these 
existing first-order tools could improve precision for first- 
order predicate checks in our system; exploring this is an 
important avenue for future work. 

Verification of higher-order contracts The most closely 
related work to ours is the modular set-based analysis based 
on contracts of Meunier et al. [29, 30] and the static contract
checking of Xu et al. [43, 44].

Meunier et al. take a program analysis approach, gen- 
erating set constraints describing the flow of values through 
the program text. When solved, the analysis maps source la- 
bels to sets of abstract values to which that expression may 
evaluate. Meunier's system is more limited than ours in sev- 
eral significant ways. 

First, the set-based analysis is defined as a separate se- 
mantics, which must be manually proved to correspond to 
the concrete semantics. This proof requires substantial sup- 
port from the reduction semantics, making it significantly 
and artificially more complex by carrying additional infor- 
mation used only in the proof. Despite this, the system is 
unsound, since it lacks an analogue of havoc. This unsound- 
ness has been verified in Meunier's prototype. 

Second, while our semantics allows the programmer to 
choose how much to make opaque and how much to make 
concrete, Meunier's system always treats the entire rest of 
the program opaquely from the perspective of each module. 

Third, our language of contracts is much more expres- 
sive: we consider disjunction and conjunction of contracts, 
dependent function contracts, and data structure contracts. 
Our ability to statically reason about contract checks is
significantly greater; Meunier's system includes only the
simplest of our rules for ⊢ V : C ✓.

Finally, Meunier approximates a conditional by the union
of its branches' approximations; the test is ignored. This
seemingly minor point becomes significant when consider- 
ing predicate contracts. Since predicate contracts reduce to 
conditionals, this effectively approximates all predicates as 
both holding and not holding, and thus all predicate con- 
tracts may both fail and succeed. 

Xu et al. [44] describe a static contract verification system
for Haskell. Their approach is to compile contract
checks into the program, using a transformation modeled
on Findler and Felleisen [17], simplify the program using
the GHC optimizer, and examine the result to see if any
contract checks are left in the residual program. In subsequent
work, Xu [43] applies this approach to OCaml, providing
a formal account of the simplifier employed, and extending
simplification by using an SMT solver as an oracle for
some simplification steps. In both systems, if not all contract 
checks are eliminated by simplification, the system reports 
them as potentially failing. 



Our approach extends that of Xu et al. in three crucial
ways. First, our symbolic execution-based approach allows
us to consider full executions of programs, rather than just
a simplification step. Second, Xu et al. consider a significantly
restricted contract language, omitting conjunction,
disjunction, and recursive contracts, as well as contracts that
may not terminate, may fail, or include calls to unknown
functions. As we saw in section 4, these extensions add
significant complexity and expressiveness to the system. Third,
as with Meunier et al.'s work, the user has no control over
what is precisely analyzed; indeed, Xu et al. inline all
non-contracted functions.

Blume and McAllester Q provide a semantic model of 
contracts which includes a definition of when a term is Safe, 
which is when it can never be caused to produce blame. 
We use a related technique to verify that modules cannot 
be blamed, by constructing the havoc context. However, we 
do not attempt to construct a semantic model of contracts; 
instead we merely approximate the run-time behaviors of 
programs with contracts. 

Abstract interpretation Abstract interpretation provides a 
general theory of semantic approximation [11] that relates
concrete semantics to an abstract semantics that interprets 
programs over a domain of abstract values. Our approach is 
very much an instance of abstract interpretation. The reach- 
able state semantics of CPCF is our concrete semantics, with 
the semantics of SCPCF as an abstract interpretation defined 
over the union of concrete values and abstract values repre- 
sented as sets of contracts. In a first-order setting, contracts 
have been used as abstract values [14]. Our work applies this
idea to behavioral contracts and higher-order programs. 

Combining expressions with specifications Giving se- 
mantics to programs combined with specifications has a long 
history in the setting of program refinements [23]. Our key
innovations are (a) treating specifications as abstract values, 
rather than as programs in a more abstract language, (b) ap- 
plying abstract reduction to modular program analysis, as 
opposed to program derivation or by-hand verification, and 
(c) the use of higher-order contracts as specifications. 

Type inference and checking can be recast as a reduc- 
tion semantics [27], and doing so bears a conceptual resem- 
blance to our contracts-as-values reduction. The principal 
difference is that Kuan et al. are concerned with producing a 
type, and so all expressions are reduced to types before be- 
ing combined with other types. Instead, we are concerned 
with values, and thus contracts are maintained as specifica- 
tion values, but concrete values are not abstracted away. 

Also related to our specification-as-values notion of reduction
is Reppy's [34] variant of 0CFA that uses "a more
refined representation of approximate values", namely types. 



The analysis is modular in the sense that all module imports 
are approximated by their type, whereas our approach allows 
more refined analysis whenever components are not opaque. 
Reppy's analysis can be considered as an instance of our 
framework by applying the techniques of section 5, and thus
could be derived from the semantics of the language rather 
than requiring custom design. 



Modular program analysis Shivers [38], Serrano [37], and
Ashley and Dybvig [1] address modularity (in the sense of
open-world assumptions of missing program components)
by incorporating a notion of an external or undefined value.
This is analogous to always using the abstract value • for
unknown modules; allowing more descriptive contracts can
therefore be seen as a refinement of the abstraction on
missing program components.

Another sense of the words modular and compositional 
is that program components can be analyzed in isolation 
and whole programs can be analyzed by combining these 
component-wise analysis results. Flanagan [18] presents a
set-based analysis in this style for analyzing untyped programs,
with many similar goals to ours, but without considering
specifications and requiring the whole program before
the final analysis is available. Banerjee and Jensen [4, 5]
and Lee et al. [28] take similar approaches to type-based and
0CFA-style analyses, respectively.



Other approaches to higher-order verification Kobayashi
et al. [25, 26] have recently proposed approaches to verification
of temporal properties of higher-order programs
based on model checking. This work differs from ours in 
four important respects. First, it addresses temporal prop- 
erties while we focus on behavioral properties. Second, 
it uses externally-provided specifications, whereas we use 
contracts, which programmers already add to their pro- 
grams. Third, and most importantly, our system handles 
opaque components, while model-checking approaches are 
whole-program. Fourth, it operates on higher-order recur- 
sion schemes, a computational model with less power than 
CPCF, the basis of our development. 

Rondon et al. [35] present Liquid Types, an extension to
the type system of OCaml which incorporates dependent re- 
finement types, and automatically discharges the obligations 
using a solver. This naturally supports the encoding of some 
uses of contracts, but restricts the language of refinements 
to make proof obligations decidable. We believe that a com- 
bination of our semantics with an extension to use such a 
solver to decide the ⊢ V : C ✓ relation would increase the
precision and effectiveness of our system. 

8. Conclusion 

We have presented a technique for verifying modular
higher-order programs with behavioral software contracts.
Contracts are a powerful specification mechanism that is
already used in existing languages. We have shown that by
using contracts as abstract values that approximate the
behavior of omitted components, a reduction semantics for
contracts becomes a verification system. Further, we can scale
this system both to a rich contract language, allowing expressive
specifications, and to a computable approximation
for automatic verification derived directly from our semantics.
Our central lesson is that abstract reduction semantics
can turn the semantics of a higher-order programming language
with executable specifications into a symbolic executor
and modular verifier for those specifications.
A. Auxiliary definitions

This section presents the full definitions of the type systems, 
metafunctions and relations, and machine transitions from 
the earlier sections of the paper. 

A.1 Types for PCF with Contracts 

The type system for CPCF is entirely conventional and taken 
from Dimoulas and Felleisen [12].

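CPCF Type System

\[
\Gamma \vdash n : \mathsf{N} \qquad
\Gamma \vdash \mathsf{tt} : \mathsf{B} \qquad
\Gamma \vdash \mathsf{ff} : \mathsf{B} \qquad
\Gamma \vdash \mathsf{blame}^{\ell}_{\ell'} : T
\]

\[
\Gamma \vdash \mathsf{zero?} : \mathsf{N} \to \mathsf{B} \qquad
\Gamma \vdash \mathsf{true?} : \mathsf{B} \to \mathsf{B}
\]

\[
\frac{\Gamma(X) = T}{\Gamma \vdash X : T} \qquad
\frac{\Gamma \vdash E : T \to T' \quad \Gamma \vdash E' : T}
     {\Gamma \vdash E\,E' : T'}
\]

\[
\frac{\Gamma; X{:}T \vdash E : T'}
     {\Gamma \vdash \lambda X{:}T.E : T \to T'} \qquad
\frac{\Gamma; X{:}T \vdash E : T}
     {\Gamma \vdash \mu X{:}T.E : T}
\]

\[
\frac{\Gamma \vdash E : \mathsf{B} \quad \Gamma \vdash E_1 : T \quad \Gamma \vdash E_2 : T}
     {\Gamma \vdash \mathsf{if}\ E\ E_1\ E_2 : T}
\]

\[
\frac{\Gamma \vdash C : \mathsf{con}(T) \quad \Gamma \vdash E : T}
     {\Gamma \vdash \mathsf{mon}(C, E) : T} \qquad
\frac{\Gamma \vdash E : T \to \mathsf{B}}
     {\Gamma \vdash \mathsf{flat}(E) : \mathsf{con}(T)}
\]

\[
\frac{\Gamma \vdash C : \mathsf{con}(T) \quad \Gamma \vdash D : \mathsf{con}(T')}
     {\Gamma \vdash C \mapsto \lambda X{:}T.D : \mathsf{con}(T \to T')}
\]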

A.2 Types for Symbolic PCF with Contracts 

The type system extends straightforwardly to handle abstract 
values labeled with their types. 

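SCPCF Type System

\[
\Gamma \vdash \bullet^{T} : T \qquad
\frac{\Gamma \vdash U : T \quad C \in \mathcal{C} \Rightarrow \Gamma \vdash C : \mathsf{con}(T)}
     {\Gamma \vdash U/\mathcal{C} : T}
\]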

A.3 Operations on concrete values 

The following rules define δ for concrete values, that is, the
subset of δ that relates only concrete values. We assume V, U
are concrete here.

(We have omitted the rules producing blame for arity 
mismatch and undefined cases.) 



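\[
\begin{array}{rcl}
 & & (\mathsf{add1}, n, n+1) \in \delta \\
 & & (\mathsf{car}, (U,V), U) \in \delta \\
 & & (\mathsf{cdr}, (U,V), V) \in \delta \\
 & & (+, n, m, n+m) \in \delta \\
 & & (=, n, n, \mathsf{tt}) \in \delta \\
n \neq m &\Rightarrow& (=, n, m, \mathsf{ff}) \in \delta \\
 & & (\mathsf{cons}, U, V, (U,V)) \in \delta \\
 & & (\mathsf{nat?}, n, \mathsf{tt}) \in \delta \\
V \notin \mathbb{N} &\Rightarrow& (\mathsf{nat?}, V, \mathsf{ff}) \in \delta \\
V \in \{\mathsf{tt}, \mathsf{ff}\} &\Rightarrow& (\mathsf{bool?}, V, \mathsf{tt}) \in \delta \\
V \notin \{\mathsf{tt}, \mathsf{ff}\} &\Rightarrow& (\mathsf{bool?}, V, \mathsf{ff}) \in \delta \\
 & & (\mathsf{empty?}, \mathsf{empty}, \mathsf{tt}) \in \delta \\
V \neq \mathsf{empty} &\Rightarrow& (\mathsf{empty?}, V, \mathsf{ff}) \in \delta \\
 & & (\mathsf{cons?}, (U,V), \mathsf{tt}) \in \delta \\
V \neq (U,V) &\Rightarrow& (\mathsf{cons?}, V, \mathsf{ff}) \in \delta \\
 & & (\mathsf{proc?}, (\lambda X.E), \mathsf{tt}) \in \delta \\
V \neq (\lambda X.E) &\Rightarrow& (\mathsf{proc?}, V, \mathsf{ff}) \in \delta \\
 & & (\mathsf{false?}, \mathsf{ff}, \mathsf{tt}) \in \delta \\
V \neq \mathsf{ff} &\Rightarrow& (\mathsf{false?}, V, \mathsf{ff}) \in \delta
\end{array}
\]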
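
The concrete δ rules can be read as a partial function on operator names and argument tuples. The Python sketch below follows that reading under explicit encoding assumptions: naturals are non-negative ints, tt and ff are True and False, pairs are 2-tuples, empty is the empty tuple, and procedures are Python callables. The undefined cases, which the rules above omit, simply raise ValueError here.

```python
def is_nat(v):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(v, int) and not isinstance(v, bool) and v >= 0

def is_pair(v):
    return isinstance(v, tuple) and len(v) == 2

PREDICATES = {
    "nat?": is_nat,
    "bool?": lambda v: isinstance(v, bool),
    "empty?": lambda v: v == (),
    "cons?": is_pair,
    "proc?": callable,
    "false?": lambda v: v is False,
}

def delta(op, *args):
    """Apply primitive `op` to concrete arguments."""
    if op in PREDICATES and len(args) == 1:
        return PREDICATES[op](args[0])
    if op == "add1" and len(args) == 1 and is_nat(args[0]):
        return args[0] + 1
    if op in ("car", "cdr") and len(args) == 1 and is_pair(args[0]):
        return args[0][0 if op == "car" else 1]
    if op in ("+", "=") and len(args) == 2 and all(is_nat(a) for a in args):
        return args[0] + args[1] if op == "+" else args[0] == args[1]
    if op == "cons" and len(args) == 2:
        return (args[0], args[1])
    # Arity mismatches and undefined cases produce blame in the paper.
    raise ValueError(f"delta undefined for {op} on {args!r}")
```

For instance, delta("car", (1, 2)) gives 1 and delta("false?", 0) gives False, matching the rule that only ff satisfies false?.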



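Value proves or refutes base predicate: ⊢ V : o? ✓ and ⊢ V : o? ✗

\[
\frac{\delta(o?, V) \ni \mathsf{tt}}{\vdash V : o?\ ✓} \qquad
\frac{\vdash C : o?\ ✓}{\vdash V/\mathcal{C} \cup \{C\} : o?\ ✓}
\]

\[
\frac{\delta(o?, V) \ni \mathsf{ff}}{\vdash V : o?\ ✗} \qquad
\frac{\vdash C : o?\ ✗}{\vdash V/\mathcal{C} \cup \{C\} : o?\ ✗}
\]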



Figure 7. Provability relations 

A.4 Operations on symbolic values 

The definition of δ on symbolic values is presented in
figure 8. We assume V, U are abstract. Further relations on
abstract values are presented in figures 9, 7, and 11.

A.5 Machine 



The full CESK machine is given in figure 10.

Basic reductions



(E,p,K,a) i — > (E',g, t,<r) 



(EE' e ,p, K ,a) 

(if EEi E^, p,K,a) 

(OE e ,p,n,a) 

(O E E' e ,p,K,a) 

(X,p,K,a) 

(V,p,ar e (E,g,k),a) 

(V,p,fn e ((XX.E),g,k),a) 

(V,pM(U,g,k),a) 

(V,p, \f{E,E',g,k),a) 

(V,p, \f{E,E',g,k),a) 

(V,p,op e (0,a),a) 

(((17, e),(V,p)),0,op(car, a), a) 

(((£/, e),(V,p)),0,op(cdr, a), <r) 

(V,p,op\*(0,E,g,k),a) 

(V,p,opr £ (cons, U, g, a), a) 

(V,p,opr l (0, U,q, a), a) 

(blame^, p, k, a) 



(E,p,3T t (E',p,k),cj[k i y k\) 
(E,p, \f(E[,E 2 ,p,k),a[k^ k]) 
(E, p, op^(0, fc), er[fc i y k]) 
(E, p, op\ e {0, E', p, fc), cr[fc i y «]) 
(V,g,n,a) 

(E,g,fn l {V,p,k),a[k^K]) 

(E,g[X i y a},K,cr[a^ {V,p)]) 

(blame^,0, mt, 0) 

(E, g, k, a) 

(E',g,n,a) 

(A,<D,K,a) 

(U, g, k, a) 

(V,p,K,o) 

(E,g,opr e (0,V,p,k),a) 
(((U,g),(V,p)),<D,K,a) 
(A,<D,K,a) 
(blamed, 0, mt,0) 



if(V,g)ea( P (X)) 



if <5(proc?,£7) 3 ff 
if 5(false?,V) 3 ff 
if 5(false?,V) 3 tt 
if 5{O l ,V) 3 A 



Module references 



(f 9 ,P, K,a) 



(V,0,K,<7> 

(y,0,chk{' 9 (C,0,fc),a[fc^ «]) 



if(f f ,V) e A(M) 

if (/ / S,9( 



mon^(C, V)) e A(M) 



Contract checking 



(mon^(C,£;),p,K,a) 
(y,p,chk{' 9 (C, ft fc),tr) 



(V, p, fn £ ((AI.mon^(fl, (a mon^ (C, X)))), g, k),a) 



g,f, 



(E,p,chk f h ' a (C,p,k),a[k^K}) 
(V, p, fn (U, g, fc'), a[fc' ^ \f(V/{C}, blame^ , p, fc)]) 
where C is flat and U = FC(C, V) 



(V, p, c\\k^ 9 (C^XX.D, g, fc), a) 

(V, p, chk^ 9 [C^XX.D, g, fc), a) 

(V,p,chkl' 9 (CAD,g,k),a) 

(V,p,cbk f h ' 9 (CVD,g,k),a) 

(V, p, chk-or£ s ([7, g,CvD, p', fc), a) 
(V,p,chk-orl' 9 (U,g,CvD,p',k),a) 



(V, p, chkf(C, g, fc'), a[k' ^ fn f (U, g', fc"), 

fc" i ^ chk{' s (A^[^ •-)■ 6],fc), 
&->(V,p)]> 

where (U, g') G cr(a) 

((XX.mon f h ' g (D, (a mon»' / (C, X)))), e , i, a[a ^ (V, p)]) 

if c5(proc?, V) 3 tt 

(blame£,0,mt,0) if <5(proc?, V) 3 ff 

(v, P , chk^(c, e , ^ chk£ fl (A ft fc)]) 

(V, p, ar((7, e , i), *[i ^ chk-or£»(V, p,C~V D, g, fc)]) 
where [7 = FC(C) 

(C//{C}, k, a) if (5(false?, V) 3 ff 

(£7,£>,chk{' 9 (/J,p',fc),cr) if 5(false?, V) 3 tt 



Abstract values 



(V,pM(*/C,g,k),a) 

(V,p, beg\n(E,g,k),a) 

(•/CU{CiVC 2 },p,K,a) 

(./CU{ M XC},p, K ,a) 



(E, p, begin(C7, g, fc), a) if c5(proc?, »/C) 9 tt 

where E = AMB({tt, demonic(/\ dom(C), V)}) and U = »/rng(C) 

(E, g, k, a) 

(•/CU{Ci},p,K,a) 

(./Cb){[fiX.C/X}C},p, K ,a) 



Higher-order pair contract checking 


(V,p,chk{' 9 ((C,D),g,k),a) 
(V,p,cbk f h > 9 ((C,D),g,k),a) 

(V, p, chk-cons£' 5 (C, g, U, p', fc), a) h- 


— > (blame£,0,mt,0) if £(cons?, V) 3 ff 

(J7, p, op(car, i), cr[i M> chk/ l ' 9 (C, £>, fc'), fc' M> chk-cons{' 9 (/J, [7, p. 
if 5{consl\V) 3 tt, where U = F/{flat(cons?)} 

-> (C7, p', op(cdr, i), cr[2 chk{' 9 (C, fc'), fc' opr(cons, V, p, fc)]) 


i 

,fc)J) 

2013/1/14 
i 



Contract proves or refutes base predicate: ⊢ C : o? ✓ and ⊢ C : o? ✗



h y : o?/ (o?,V,tt 
h V : o?X => (o?,V,ff 
hV:o?? => (o?, V, •/flat(bool?) 
hV:nat?S => (addl, V, •/fiat(nat?) 

hy:nat?X => (addl £ , V, blamef ddl 
hy:nat?? => (addl,V,»/flat(nat?) 

A (addl £ ,F,blamef ddl 
hy:cons?/ (car, V,7Ti(y) 
h V : cons?/ (car £ , V, blame^ 
hy icons?? (car, V,7n(F) 
A (car £ , V, blame^ ar 
icons?/ =► (cdr,y,7r 2 (y) 
h V : cons?X =► (cdr £ , V, blamef: dr 
icons?? =► (cdr,y,7r 2 (y) 
A (cdr , V, blame^ dr 
h V : nat?/Ah U : nat?/ => 
(+,V,C7>/flat(nat?) 
hF:nat?X (+, V, [7, blame^ 
h77:nat?X (+, V, [7, blame^ 
h y : nat??A h U : nat?? 

(+,V,C7>/flat(nat?) 
hF:nat?? => (+, V, [7, blame^ 
h 77: nat?? => (+, V, £7, blame^. 
h V : nat?/ Ah 7/ : nat?/ => 
(=, V, U, •/flat(bool?) 
hy : nat?X => (=, V, f7, blamel 
h77:nat?X => (=, V, U, blame! 
h y : nat??A h 77 : nat?? 

(=, y, L7, •/flat(bool?) 
hF:nat?? (=, V, t7, blame! 
hFiinat?? (=,y 77, blame! 

(cons, V, 77, (y, [7) 



G 5 

G S 

G S 

G S 

G <5 

G <5 

G <5 

G 6 

G (5 

G <5 

G <5 

G <5 

G 6 

G <5 

G (5 

G <5 

G <5 

G S 

G (5 

G <5 

G <5 

G <5 

G 6 

G <5 

G <5 

G <5 

G (5 

G S 
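
The way δ becomes a relation on abstract values, with several possible outcomes per application, can be sketched in Python for predicates and add1. This is an illustrative simplification rather than the figure's full definition: abstract values carry only predicate names, the predicates in DISJOINT are assumed mutually exclusive, blame is a plain string without labels, and the names Opaque, proves, delta_pred, and delta_add1 are hypothetical.

```python
from dataclasses import dataclass

BLAME = "blame"  # stands in for a labeled blame in this sketch
DISJOINT = frozenset({"nat?", "cons?", "proc?"})

@dataclass(frozen=True)
class Opaque:
    """An abstract value refined by the predicates it is known to satisfy."""
    contracts: frozenset = frozenset()

def proves(v, pred):
    """Three-valued judgment: 'yes', 'no', or 'maybe'."""
    if pred in v.contracts:
        return "yes"
    if pred in DISJOINT and v.contracts & DISJOINT:
        return "no"
    return "maybe"

def delta_pred(pred, v):
    """Outcomes of applying a base predicate to an abstract value."""
    verdict = proves(v, pred)
    if verdict == "yes":
        return {True}
    if verdict == "no":
        return {False}
    return {Opaque(frozenset({"bool?"}))}

def delta_add1(v):
    """Outcomes of add1: an unknown natural, blame, or both."""
    verdict = proves(v, "nat?")
    outcomes = set()
    if verdict in ("yes", "maybe"):
        outcomes.add(Opaque(frozenset({"nat?"})))
    if verdict in ("no", "maybe"):
        outcomes.add(BLAME)
    return outcomes
```

Applying delta_add1 to a completely unknown value returns both an abstract natural and blame, which is the nondeterminism the rules for the "?" case express.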



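\[
\vdash \mathsf{flat}(\mathsf{false?}) : \mathsf{bool?}\ ✓ \qquad
\vdash \mathsf{flat}(o?) : o?\ ✓
\]

\[
\vdash C \mapsto \lambda X.D : \mathsf{proc?}\ ✓ \qquad
\vdash (C, D) : \mathsf{cons?}\ ✓
\]

\[
\frac{\vdash C : o?\ ✓ \quad \vdash D : o?\ ✓}{\vdash C \vee D : o?\ ✓} \qquad
\frac{\vdash C : o?\ ✓ \ \text{or}\ \vdash D : o?\ ✓}{\vdash C \wedge D : o?\ ✓}
\]

\[
\frac{o? \neq \mathsf{proc?}}{\vdash C \mapsto \lambda X.D : o?\ ✗} \qquad
\frac{o? \neq \mathsf{cons?}}{\vdash (C, D) : o?\ ✗}
\]

\[
\frac{\vdash C : o?\ ✗ \quad \vdash D : o?\ ✗}{\vdash C \vee D : o?\ ✗} \qquad
\frac{\vdash C : o?\ ✗ \ \text{or}\ \vdash D : o?\ ✗}{\vdash C \wedge D : o?\ ✗}
\]

\[
\frac{\vdash [\mu X.C/X]C : o?\ ✗}{\vdash \mu X.C : o?\ ✗} \qquad
\frac{o? \neq o?' \quad \{o?, o?'\} \neq \{\mathsf{false?}, \mathsf{bool?}\}}
     {\vdash \mathsf{flat}(o?') : o?\ ✗}
\]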

Figure 9. Contract relations 
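
The contract relations of figure 9 translate fairly directly into two recursive functions over a contract syntax tree. The Python sketch below covers flat, function, pair, disjunction, and conjunction contracts and deliberately leaves out recursive (μ) contracts; the class and function names are assumptions of the sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Flat:
    pred: str            # name of a base predicate, e.g. "nat?"

@dataclass(frozen=True)
class Arrow:
    dom: object
    rng: object

@dataclass(frozen=True)
class Pair:
    left: object
    right: object

@dataclass(frozen=True)
class Or:
    left: object
    right: object

@dataclass(frozen=True)
class And:
    left: object
    right: object

def contract_proves(c, pred):
    """True when contract c guarantees that base predicate pred holds."""
    if isinstance(c, Flat):
        return c.pred == pred or (c.pred == "false?" and pred == "bool?")
    if isinstance(c, Arrow):
        return pred == "proc?"
    if isinstance(c, Pair):
        return pred == "cons?"
    if isinstance(c, Or):
        return contract_proves(c.left, pred) and contract_proves(c.right, pred)
    if isinstance(c, And):
        return contract_proves(c.left, pred) or contract_proves(c.right, pred)
    return False

def contract_refutes(c, pred):
    """True when contract c guarantees that base predicate pred fails."""
    if isinstance(c, Flat):
        return c.pred != pred and {c.pred, pred} != {"false?", "bool?"}
    if isinstance(c, Arrow):
        return pred != "proc?"
    if isinstance(c, Pair):
        return pred != "cons?"
    if isinstance(c, Or):
        return contract_refutes(c.left, pred) and contract_refutes(c.right, pred)
    if isinstance(c, And):
        return contract_refutes(c.left, pred) or contract_refutes(c.right, pred)
    return False
```

Both functions returning False for a contract and predicate corresponds to the "neither proves nor refutes" case, where the analysis must consider both outcomes.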



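Figure 8. Basic operations on abstract values

Value proves or refutes contract: ⊢ V : C ✓ and ⊢ V : C ✗

\[
\frac{C \in \mathcal{C}}{\vdash V/\mathcal{C} : C\ ✓} \qquad
\frac{\vdash V : o?\ ✓}{\vdash V : \mathsf{flat}(o?)\ ✓} \qquad
\frac{\vdash V : o?\ ✗}{\vdash V : \mathsf{flat}(o?)\ ✗}
\]

\[
\frac{\vdash V : C\ ✗ \quad \vdash V : D\ ✗}{\vdash V : C \vee D\ ✗} \qquad
\frac{\vdash V : C\ ✗ \ \text{or}\ \vdash V : D\ ✗}{\vdash V : C \wedge D\ ✗}
\]

\[
\frac{C \in \{\mathsf{flat}(\mathsf{cons?}), \mathsf{flat}(\mathsf{nat?}), \mathsf{flat}(\mathsf{false?}), \mathsf{flat}(\mathsf{bool?})\} \quad \vdash V : \mathsf{proc?}\ ✓}
     {\vdash V : C\ ✗}
\]

\[
\frac{\vdash V : \mathsf{proc?}\ ✗}{\vdash V : C \mapsto \lambda X.D\ ✗} \qquad
\frac{\vdash V : \mathsf{cons?}\ ✗}{\vdash V : (C, D)\ ✗}
\]

\[
\frac{\vdash \pi_1(V) : C\ ✗ \ \text{or}\ \vdash \pi_2(V) : D\ ✗}{\vdash V : (C, D)\ ✗} \qquad
\frac{\vdash U : C\ ✗ \ \text{or}\ \vdash V : D\ ✗}{\vdash (U, V) : (C, D)\ ✗} \qquad
\frac{\vdash V : [\mu X.C/X]C\ ✗}{\vdash V : \mu X.C\ ✗}
\]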



Figure 11. Provability relations 
B. Proofs

The approximation relation on expressions, modules, and 
programs is formalized below. We show only the impor- 
tant cases and omit the straightforward structurally recur- 
sive cases. We parametrize by the module context of the ab- 
stract program, to determine the opaque modules; we omit 
this context where it can be inferred from context. 



Approximates 



P C Q, M E M TV, and E E M E' 



V C • 



mon(C,P) E • • C 



(AXmon(A (V mon(C,X)))) E • • C^XX.D 
VCV' hV:CV 



V-C^V V-CQV' C V E V ■ C 

mon(C,P)EP' 



NQgM E'Q^E 

TVP' C ME mon(C, E) E mon(C, E') 

(module f C •) G Mor/ = f 
blame^ E M 7? 

(module /C») G M 

(module /CV) E^ (module / C •) 
i i 

We lift ⊑ to evaluation contexts ℰ by structural extension.
We lift ⊑ to contracts by structural extension on contracts
and ⊑ on embedded values. We lift ⊑ to vectors by point-wise
extension and to sets of expressions by point-wise,
subset extension.

With the approximation relation in place, we now state 
and prove our main soundness theorem. 

Lemma 2. If V ⊑_M U, then δ(O, V) ⊑_M δ(O, U).

Proof. By inspection of δ and cases on O and V. □



Lemma 3. If E ⊑_M E' and V ⊑_M U, then [V/X]E ⊑_M [U/X]E'.

Proof. By induction on the structure of E and cases on the
derivation of E ⊑_M E'. □

Lemma 4. If C ⊑_M D, then FC(C) ⊑_M FC(D).

Proof. By induction on the structure of C and cases on the
derivation of C ⊑_M D and the definition of FC. □



Lemma 5. Let E = fc(C), ?/zen 

7. i/H V : CV andV Em ^ then E u ' — » A 3 
2. //h V : CX ant/ V E M P, f/zen 75 U i — » A □ f f . 

Proof. By induction on the structure of C and cases on the 
derivation of P E M V and the definition of FC. □



Lemma 6 



If P i — > P', P' ^ M blame^ and P E Q, 
Q' one/ P' E <2'- 



Proof. We split into two cases. 
Case(l): 

P = M£[P] i — > M £[E'] 
Q = TV E'[E'] i — ► TV£'[P"] 

where £ £'. We reason by cases on the step from E to 
E'. 

• Case: 75 = P and (module / C E') G ill 
If / is transparent in TV, then (module / C 75') G TV and 
we are done by simple application of the reduction rules 
for module references. Otherwise, / ^ g and thus 



E' 



(C,V) P" = mon^(C7,./{C7}), 



/ 



but now E' E^v E ", since P' E^ •/{<?}• 

• Case: mon(C^AXD,F) i — ► (AXmon(fl, (P mon(C, X)))) 
where <5(proc?, 7) 3 tt and P = V/{C^\X.D}. 

Since P' is a redex, by E we have E' = mon(C" M> 
\y.D', V'), where V ^$V. By lemmag <5(proc?, V) 3 
tt. So E" = (Ay.mon(A, (P' mon(C',y)))) where 
P' = V /{C'^Xy.C} Ew V/{Ch-AX£>} and thus 
P' E,v E" . 

• Case: Vi i — > blame^, where £(proc?, Vi) 9 ff. 

By E, we have E' = U x U$ and U l E^ v i- B Y lemmaH 



<5(proc?, Pi) 9 ff, hence Pi P| 



blame. 



• Case: Vi i — > P', where <S(proc?, Vi) 9 tt. 

By E, we have P' = Pi P| and P, E^ Vi- B Y lemma|2] 
<5(proc?, Pi) 9 tt. 

Either Vi and Pi are structurally similar, in which case 
the result follows by possibly relying on lemma [3] or 
Vi = {\ y X.E Q )/C and Pi = m/C. There are two pos- 
sibilities for the origin of V\. either it was blessed or 
it was not. If Vi was not blessed, then C contains no 
function contracts, implying C contains no function con- 
tracts, hence E" = •, and the result holds. Alterna- 
tively, Vi was blessed and C contains a function contract 
C i y \X.D. But but by the blessed application rule, we 
have £ = £ i [mon([V 2 / /-X']P / , [ ])]> thus by assumption 
£' = ^[mon([P^/X]P',[])], implying [V^/X}D E 
[U' 2 / X]D' , finally giving us the needed conclusion: 

£ l [ m on([Vi/X]D,E')] Ejv £[[^on{[U' 2 / X]D\ E'% 

where E" = •/{[U 2 /X]D / \ C'^XX.D 1 G C'}. 

• Case: mon(C, V) i — > E' where C is flat. 

If h V : d, then the case holds by use of lemma |4] If 
h V : C S , then the case holds by lemma gl). If h V : 
CX, then the case holds by lemma[5|2). 

• The remaining cases are straightforward. 
Case (2):

P = M£ 1 [£ 2 [E]]^M£ 1 [£ 2 [E']} 
Q = N£[[£' 2 [E']) 

where £\ is the largest context such that £\ £[ but 
£l%fj £'2- 

In this case, we have £ 2 [E] \—^£ 2 [E'], but since £2%^ 
£' 2 , this must follow by one of the non-structural rules for C, 
all of which are either oblivious to the contents of E and E', 
or do not relate redexes to anything. □ 

Lemma 7. If there exists a context £ such that 

M (module fCV) £[f] 1 — » blame£, 

then 

M (module / C V) (havoc /) 1 — » blame£. 



Proof. If there exists such an £, then without loss of gener- 
ality, it is of some minimal form T> in 

V = []\(VV) \ (car V) | (cdr£>), 

but then there exists a V equal to T> with all values replaced 
with • such that M T)'\V] 1 — » blame^. This is because 
at every reduction step, replacing some component of the 
redex with • causes at least that reduction to fire, possibly 
in addition to others. Further, by inspection of havoc, if 
M V'[V] 1 — » blame^, then M (havoc V) 1 — » blame£. 

□ 

Theorem^ By the definition of eval, we have P 1 — » A. 
Let the number of steps in P 1 — » A be n. There are two 
cases: either A — V, or A = b\ame e e ,. If A = V, then we 
proceed by induction on n and apply lemma|6]at each step. 

If A = blame^, then there are two possibilities. If £ is the 
name of an opaque module in M or if I = f , then A A' 
immediately. If I = f is the name of a concrete module 
in M, then havoc / 1 — » A by lemma [7] and therefore 
A 6 eval(Q) by the definition of eval. □ 

Theorem 3. We reason by case analysis on the transition. In
the cases where the transition is deterministic, the result follows
by calculation. For the remaining non-deterministic
cases, we must show an abstract state exists such that the 
simulation is preserved. By examining the rules for these 
cases, we see that all hinge on the abstract store in a soundly 
approximating the concrete store in er, which follows from 
the assumption that a(a) C a. □ 

Theorem^ The state-space of the machine is non-recursive 
with finite sets at the leaves on the assumption that addresses 
and base values are finite. Hence reachability is decidable 
since the abstract state-space is finite. □ 
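
The decidability argument rests on the fact that a worklist search over a finite state space must terminate. A minimal Python sketch of such a search follows; the function name reachable and the toy step relation in the usage note are assumptions, standing in for the abstract machine's transition relation.

```python
from collections import deque

def reachable(initial, step):
    """Return every state reachable from `initial`, where `step(s)` yields
    the successors of s. Terminates whenever the state space is finite,
    because each state is enqueued at most once."""
    seen = {initial}
    work = deque([initial])
    while work:
        state = work.popleft()
        for succ in step(state):
            if succ not in seen:
                seen.add(succ)
                work.append(succ)
    return seen
```

For a toy nondeterministic relation on the states 0 to 4, reachable(0, lambda n: {(n + 1) % 5, (n * 2) % 5}) visits all five states and then stops.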






